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ABSTRACT 

A likelihood-based method for measuring weak gravitational lensing shear in deep galaxy 
surveys is described and applied to the Canada-France-Hawaii Telescope (CFHT) Lensing 
Survey (CFHTLenS). CFHTLenS comprises 154 deg^ of multicolour optical data from the 
CFHT Legacy Survey, with lensing measurements being made in the i' band to a depth i^g < 
24.7, for galaxies with signal-to-noise ratio i^sn ^ 10. The method is based on the Zen^fit 
algorithm described in earlier papers, but here we describe a full analysis pipeline that takes 
into account the properties of real surveys. The method creates pixel-based models of the 
varying point spread function (PSF) in individual image exposures. It fits PSF-convolved two- 
component (disk plus bulge) models, to measure the ellipticity of each galaxy, with bayesian 
marginalisation over model nuisance parameters of galaxy position, size, brightness and bulge 
fraction. The method allows optimal joint measurement of multiple, dithered image exposures, 
taking into account imaging distortion and the alignment of the multiple measurements. We 
discuss the effects of noise bias on the likelihood distribution of galaxy ellipticity. Two sets 
of image simulations that mirror the observed properties of CFHTLenS have been created, to 
establish the method's accuracy and to derive an empirical correction for the effects of noise 
bias. 

Key words: cosmology: observations - gravitational lensing - methods: data analysis - meth- 
ods: statistical 
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1 INTRODUCTION 

Weak gravitational lensing allows measurement of the matter con- 
tent of the Universe, which, when used in combination with other 
probes, should enable tests of dark energy models of the Universe 
and tests of general relativity on cosmological scales. However, de- 
spite the detection of weak lensing aro und galaxy clusters more 
than 20 years ago (e.g lXvson et al.lll99(]|) and the detection of cos- 
mol ogical large -scale lensing ('cosmic shear') more than 10 years 
ago feacon et al. 2000; Van Waerbeke et al. 2000; Witt man etal] 
I2OOOI) . it has proved difficult to make measurements that are free 
from systematic error at the level required to measure cosmolog- 
ical parameters accurately. To date, imaging data obtained from 
the Hubble Space Telescope (HST) has been the most successful 
in this respect, but has yielded accurate cosmo logical results only 
relatively recently (e.g. ISchrabback et al.l2010h . Extension of these 
studies to large sky areas is currently only feasible from the ground, 
but then the weak lensing measurements need to be made using 
images of faint galaxies that are only marginally resolved and for 
which the lensing signal needs to be disentangled from the effects 
of an image PSF which varies both with position within an image 
and between images. 

Fig.[T]s hows the sizes of disk galaxies in the 'Groth Strip', as 
measured bv lSimard et d] j2002h using data from the HST Wide 
Field Planetary Camera 2 (WFPC2). These data are used in Ap- 
pendixlBl to obtain an estimate of the bayesian prior distribution 
of galaxy size used in this paper, however it is useful to show 
the distribution here to illustrate the level of difficulty imposed 
by ground-based measurements. The figure shows the fitted val- 
ues of semi-major axis disk exponential scalelength obtained for 
galaxies which had fitted bulge-to-total flux ratios (B/T) less than 
O.fl We have measured the median values of that scalelength in 
bins of i-band (WFPC2 filter F814) apparent magnitude, excluding 
values rd < 0.02", and these are shown as large filled circles in 
Fig-El In this paper, we describe how we have made weak lensing 
measurements using ground-based data from the Canada-France- 
Hawaii Telescope (CFHT) MegaCam camera (Boulade et al. 2003) 
obtained for the CFHT Legacy Survey (CFHTLS) and analysed 
as the CFHT Lensing Survey (hereafter CFHTLenS): Fig.[T]also 
shows the MegaCam typical pixel size of 0.187" and a typical 
CFHTLS PSF half-width half-maximum (HWHM) of 0.35". It can 
be seen that at the faintest magnitudes (CFHTLenS analysed galax- 
ies to i^B < 24.7) the median semi-major axis exponential scale- 
length of galaxies is comparable to the pixel size, and that very few 
galaxies have scalelengths that are larger than the PSF HWHM. 
Given that the faint galaxies are also noisy, with the faintest galax- 
ies having signal-to-noise ratio as low as 7, it is clear that highly 
accurate methods need to be adopted for measuring the systematic 
distortions of galaxies at the level of a few percent in ellipticity that 
are expected from weak lensing. 

We describe in this paper a bayesian model-fitting method 
aimed at measuring the shapes of faint galaxies in optical galaxy 
surveys. Current weak lensing surveys such as CFHTLenS com- 
prise about lO'^ galaxies, so one design aim of the algorithms is that 
they should be fast. However, as survey sizes increase, the require- 
ments on acceptable levels of non-cosmological systematics signals 
become more stringent, meaning that the new generation of mea- 

^ A population of objects that were fitted by ISimard et al.l j2002h with 
scalelengths less than 0.02" have been excluded: these very small scale- 
lengths are not correlated with apparent magnitude and are li kely fits to 
stars or are otherwise spurious measurements, as noted bv iSimard et all 
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Figure 1. Fitted disk semi-major axis scalelength, plotted as a function of 
j'-band magnitude, for galaxies identified as being disk-dominated, from 
the fits of Simard et al. ( 2002) to HST WFPC2 data. The large filled circles 
indicate the median size measured in bins of apparent magnitude. The hori- 
zontal lines indicate the CFHT MegaCam pixel size of 0.187" (lower line) 
and a typical CFHTLenS PSF half-width half maximum of 0.35" (upper 
line). 



surement algorithms must be simultaneously faster and more accu- 
rate than previous methods. The principles th at were adop t ed for 
measuring galaxy ellipticity are described bv iMiller et al. I l l200l 
hereafter Paper I), and implemented in the 'fensfit' algorithm. Tests 
on simulated images of a previous implementati on of the ba- 
sic sh ear-measurement algorithm are described bv iKitching et al.l 
l l2008i Paper II). 

This paper describes a number of significant improvements to 
the algorithm which proved necessary for working on data from a 
real survey, rather than simplified simulations. 

(i) PSF modelling took account of spatial variation of the PSF 
and possibl e discontinuities bet ween CCDs in the mosaic camera 
(MegaCam. iBoulade et al]|2003l, in the case of CFHTLenS). 

(ii) Galaxy measurement used multiple, dithered exposures, op- 
timally combined in the measurement process, taking account of 
pointing and camera distortion variations and varying image qual- 
ity. 

(iii) Camera distortion was measured directly from images and 
corrected in the measurement process. 

(iv) Fitted galaxy models allowed joint fitting of bulge and disk 
components. 

(v) Full bayesian marginalisation over nuisance parameters was 
carried out. 

(vi) Additional measures were used to account for bad and satu- 
rated pixels and columns, cosmic rays, image ghosts, blended stars 
and galaxies and recognition of morphologically complex galaxies. 

Information on these various improvements is described below. 
Section 2 describes CFHTLenS, Sections 3-6 provide details of the 
method and the shear measurement pipeline. Section 7 provides an 
overview of tests for systematic errors, which are described in more 
detail by Heymans et al. (2012), and Sections describes extensive 
image simulations that have been used to test the method and to 
calibrate the effects of noise bias. Appendices A and B provide ad- 
ditional details of the models and priors used in the method, and 
a mathematical illustration of the effect of noise bias is given in 
AppendixICl 
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2 THE CANADA-FRANCE HAWAII TELESCOPE 
LENSING SURVEY 

The data used in this paper are from the CFHT Wide Legacy 
SurvejB analysed by the CFHTLenS teanif|. The CFHTLS Wide 
consisted of 171 pointings covering a sky area of 154 deg^ in 
five b ands, u* g'r'i' z , with the MegaCam camera jBoulade et alj 
l2003h . which has a field of view of approximately 1 deg^ . The 
pointings were distributed over four separate patches on the sky, 
and each patch was contiguous, covering an area of 63.8, 22.6, 44.2 
and 23.3 deg^, respectively for the Wl, W2, W3 and W4 patches. 
Each pointing comprised typically seven i'-band exposures, often 
with many of those taken during the same night. In some fields, one 
or two exposures were judged to be of too poor quality, based on the 
seeing or apparent degree of PSF variation, reducing the number of 
usable exposures to 5 or 6. In some other fields, two entire sets of 
exposures were obtained if there was concern about the quality of 
the first set, but in a number of cases all 14 exposures were subse- 
quently judged to be usable and were included in the CFHTLenS 
analysis. Exposures were dithered by up to 90" in declination and 
24" in right ascension in patches Wl, W3 and W4, and by up to 
400" in W2. 

The CFHTLenS analysis of the data was designed and opti- 
mized for weak lensing science, both photometry and shape mea- 
surements. The photome tric a nalysis of the surve y is described by 
iHildebrandt et all ( 1201 2h and lErben et all ( l2012h . Galaxy shapes 
have been measured on the i' band which reaches a 5(t point 
source limiting magnitude of ~ 25.5. A magnitude limit of 
< 24.7 was imposed for shape measurement, noting that 
photometric re dshift accuracy also is poor at fainter magnitudes 
than this limit faildebrandt et al.ll2012h . Exposures obtained prior 
to June 2007 were obtained with the i'-band filter MP9701. Af- 
ter October 2007 an i'-band filter with a slighter different spectral 
response, MP9702, was used. 

All fields had a full-width half-maximum (FWHM) seeing 
condition which was required to be better than 0.8". Unlike pre- 
vious lensing analyses, galaxy shape measurement was performed 
on the individual exposures, photometrically calibrated with Elixir 
jMagnier & Cuillandrel2004l) . The CFHTLenS data w ere then pro- 
cessed with the THELI pipeline (see lErben et alj2005l). The specifi c 
algorithms and procedures were described bv lErben et al. I j2009l) . 
In that work, the astrometric and photometric calibration was made 
independently in each 1 deg^ MegaPrime pointing. The most im- 
portant refinement to the calibration for CFHTLenS was a simulta- 
neous treatment of all data from a CFHTLenS patch. This typically 
led to an improvement of a factor 2-3 in astrometric and photomet- 
ric accuracy compared with the earlier analysis. Fu rther de t ails of 
the construction of the survey are given by .Hevmans et al.l j2012h 
and lErben etall ( l2012h . 



3 THE SHEAR MEASUREMENT METHOD 
3.1 Principles 

Key points to emphasise about the approach of bayesian model- 
fitting are 

(i) It enables a rigorous statistical framework to be established 
for the measurement, resulting in optimally-weighted use of the 

2 www.cfht.hawaii.edu/Science/CFHTLS 
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available data, regardless of variations in signal-to-noise ratio or 
PSF 

(ii) Quantities other than ellipticity that form part of the prob- 
lem may be marginalised over and, provided an accurate prior is 
available, those uninteresting quantities may be eliminated from the 
problem without bias. The prime example is that of the unknown 
galaxy position: all other methods published to date require a fixed 
centroid position for a galaxy, and centroid err ors introduce spuri- 
ous e llipticity, with the possibility of bias (e.g. lBemstein & Jarvij 
l2002h . In model-fitting we can treat the unknown galaxy position 
as a free parameter and marginalise. 

(iii) The weak lensing signal is carried by the faintest galaxies, 
often the integrated signal-to-noise ratio may be as low as 10. A 
model-fitting approach adds extra information to the problem, such 
as prior knowledge of possible surface brightness distributions. In 
principle this leads to more precise measurements than a method 
that does not use such information, thereby allowing measurements 
to lower signal-to-noise ratio. 

(iv) The models must be unbiased, however, in the sense that 
they should span the range of surface brightness profiles of real 
galaxies, and the priors adopted for the parameters of those models 
should be an accurate reflection of the true distribution. If these 
conditions are not satisfied, the model-fitting may be systematically 
biased ( Voigt & Bridle 2010; Bernstein 2010). 

(v) A full bayesian approach should be able to correct for the 
effects of noise bias, and hence avoid the need for any indepen- 
dent calibration. We shall see in Section [33] that we have not yet 
achieved that goal. 

Point (iv) is worth considering in more detail. iBemsteii] 

has argued that model basis functions need to be truncated 
to some finite set to avoid the fitting results being dominated by 
noise, and that truncation leads to bias (see also Ngan et al. 200SJ). 
Thus, even when specific physical models are not assumed, the 
use of any so-called 'model-independent' basis set also results in 
biased results, unless the truncated basis set employed is capable 
of accurately modelling the full observed range of surface bright- 
ness profiles. In methods tha t use simple statisti cal measures of the 
data, such as momen ts (e.g iKaiser et al.iri995l . hereafter KSB, or 
lMelchioret^l20Ilh it is necessary to use weighted moments be- 
cause of image noise, and the use of a finite set of weighted mo- 
ments can lead to a bias similar to the use of truncated basis sets. 
Thus, all methods that aim to measure galaxy shapes from noisy, 
PSF-convolved data have the risk of producing biased results, and 
choosing a so-called 'model-independent' method does not allevi- 
ate this concern (see also Melchior & Viola 2012). The approach 
adopted here is that if the model basis sets are a close match to 
the true galaxy surface brightness distributions, and in particular if 
the PSF-convolved model distributions span the range of true PSF- 
convolved distributions, then it should be possible to avoid model 
truncation bias. By choosing model surface brightness distributions 
that match the bulge and disk components of galaxies at low red- 
shift, we are making a choice of a basis set that should minimise 
bias. That does leave open the question of whether such models 
can span the range of galaxy surface brightness distributions at high 
redshift, given the expected morphological evolution of the galaxy 
population, but whether or not the fitting of composite disk and 
bulge models to galaxies at high redshift results in bias has not yet 
been established. This concern should be explored in future work, 
especially given the ever-increasing accuracy required for future 
surveys. 
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The models fitted in CFHTLenS comprised two-component 
bulge plus disk galaxies (Appendix|All. There were seven free pa- 
rameters: galaxy ellipticity, ei, 62; galaxy position x, y; galaxy 
size r; galaxy flux; and bulge fraction. Ultimately we are interested 
only in the ellipticity, so the other parameters were regarded as nui- 
sance parameters and marginalised over. Bayesian marginalisation 
was used, adopting prior distributions for each parameter, which 
are described in AppendixlBl With the given priors, each parame- 
ter was marginalised over either analytically, if possible, or other- 
wise numerically. The bulge fraction marginalisation was analytic, 
galaxy flux and size were numerically marginalised over by sam- 
pling multiple values, fitting a function and integrating. Galaxy po- 
sition was numerically marginalised over using the Fourier method 
described in Paper I: the process of model-fitting is equivalent to 
cross-correlation of the model with the data, and marginalisation 
over unknown position is equivalent to integrating under the peak of 
the cross-correlation function. Thus the likelihood surface was ob- 
tained as a function of only the apparent galaxy two-component el- 
lipticity. Finally, that likelihood surface was converted to weighted 
estimates of ellipticity for use in shear measurement (Section [33] l. 
The following sections give more detail of the galaxy models and 
of the calculation of the likelihood function and its marginalisation. 

3.2 Galaxy models 

Adopting the same conventions as in Paper I, observed galaxy el- 
lipticity e is related to the intrinsic galaxy ellipticity e" by 

dSchramm & Kavsejl 19951 : ISeitz & Schneideilll997h . where e and 
g are represented as complex variables, g, g* are the reduced shear 
and its complex conjugate respectively, and e is defined in terms 
of the major and minor axes and orientation a, b, respectively, as 
e = (a — b)/{a + b) exp{2i6). In this formalism, we expect 



(2) 



for an unbiased sample with g < 1, where (e") = 0. Note 
that this result differs from the other commonly used formalism 
which instead uses ellipticity, or polarisation, defined as e = 
[a^ — b'^)/{a^ + b^) exp{2i9), and which requires measurement 
of the distribution of galaxy ellipticities in order to c alibrate the 
response to lensing shear (e.g. lBemstein & Jarvisll20o3) . 

The model-fitting approach to shear measurement of Paper I 
was adopted, but extended to include multiple galaxy components, 
which we assumed to comprise a disk and bulge component. In re- 
ality, galaxies may have more complex morphologies than are de- 
scribable by this simple phenomenological model. In non-bayesian 
methods, models of greater complexity than req uired by the dat a 
lead to 'overfitting' and its associated biases (e. g lBemsteinll20 1 oh . 
In principle, a bayesian method could include additional compo- 
nents and model parameters, provided that prior probability dis- 
tributions are available, so that parameters not of interest to the 
shear measurement may be marginalised over. In practice, we wish 
to avoid using models of greater complexity than are required by 
the data, particularly as more complex models increase the compu- 
tational time required to make the measurements. In CFHTLenS, 
early e xperiments with t he sys tematics tests described in Section|7] 
and by iHevmans et alj ( |2012[) indicated that significant improve- 
ments could be obtained by extending the single-component model 
of Paper I to a simple two-component galaxy model in which only 
one additional free parameter was allowed. 



The generation of the galaxy models is described in Ap- 
pendix|A] We assumed that a galaxy survey may contain two fun- 
damental galaxy types. A fraction /b were bulge-dominated galax- 
ies with a single morphological component, which was assumed to 
be described by a de Vaucouleurs profile (Sersic index 4). A frac- 
tion (1 — /s) were disk galaxies which comprised both a bulge 
component and a disk component. The disk component was as- 
sumed to be a pure exponential, Sersic index 1, with the bulge 
component again having Sersic index 4. The model components 
were truncated in surface brightness at a major axis radius of 4.5 
exponential scalelengths (disk component) or 4.5 half-light radii 
(bulge component), which was both computationally convenient 
and also in general agreement w ith observations of disk galaxies 
(e.g. lvan der Kruit & SeMi3ll982l) . 

To avoid prohibitive computational cost by introducing an- 
other model parameter, which would not be analytically marginal- 
isable, the half-light radius of the bulge was set equal to the ex- 
ponential scalelength of the disk for the composite disk-dominated 
galaxies (i.e. the bulge half-light radius was 0.6 times the value of 
the disk half-light radius). In reality, galaxies display a wide range 
in their ratios of the sizes of these components, although that dis- 
tribution itselfde£ends_onthe_assumed surface brightness profiles 
adopted JOraham & Worlevl2008l) . The adopted ratio of bulge half- 
light radius to exponential scalelength is towards the upper end of 
the range encountered in the literature. As the CFHTLenS galaxies 
are generally only marginally resolved, smaller bulges would have 
half-light radii significantly smaller than the pixel scale, and the fi- 
nal PSF-convolved models were largely insensitive to the choice of 
this ratio, although we did find that significantly smaller values of 
bulge half-light radius led to larger PSF-galaxy cross-correlations 
(see Section|7]l for the smallest fitted galaxies, and were therefore 
disfavoured. 

Given the low signal-to-noise ratio and the small number of 
pixels associated with the faintest galaxies in the CFHTLenS data, 
a more complex set of galaxy models, with more free parameters, 
was not justified. We might ask, however, whether the particular 
models chosen are the 'best', in the sense of being the least biased, 
or whether some other choice might be better. To answer such a 
question, we need to have a good understanding of the actual pop- 
ulation of galaxies being measured. Recently, attempts have been 
made to create realistic simulations of t he galaxy population base d 
on high-resolution HST observations jMandelbaum et al] |2012|). 
and these form the basis of the GREAT3 challenge simulationO 
In the current paper, we take the view that the bulge-plus-disk pa- 
rameterisation is known to fit well to the local galaxy population, 
and therefore the surface brightness distributions of such models 
are already a necessary ingredient of any successful models. Even 
though high-redshift galaxies tend to have more complex mor- 
phologies than local galaxies, the effects of convolution with the 
PSF may render such differences unimportant in ground-based data 
of CFHTLenS quality, where the high-redshift galaxies are only 
marginally resolved. We discuss in Section [83] the possible ampli- 
tude of biases that might arise from inappropriate model choice. 
The requirements for larger, future surveys, with correspondingly 
tighter requirements on the acceptable level of systematic biases, 
will need to be established through more extensive simulations. 



greatSchallenge.info 
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3.3 The likelihood function and marginalisation over 
nuisance parameters 

We define the likelihood C and the goodness-of-fit statistic as 

2 



X 



-21og£ = ^ 



- SBg, - S{1 - B)f, 



(3) 



where yi is the image data value of pixel i, Oi is the statistical un- 
certainty of that data value, fi is the disk component model value 
for that pixel, convolved with the PSF, Qi is the bulge component 
model amplitude, convolved with the PSF, B is the bulge fraction 
defined as the ratio of the flux in the bulge component to the total 
flux (often written B/T) and 5* is the total galaxy flux. The pixel 
noise is assumed to be stationary and uncorrelated, which is ap- 
propriate for shot noise in CCD detectors in the sky-noise limit. 
Bright galaxies may make a significant contribution to photon shot 
noise, but the resulting non- stationary noise would then invalidate 
the /ewsfit Fourier-space approach which is described in Paper I and 
used below. Hence this algorithm is appropriate for model-fitting to 
faint galaxies in the sky-limited regime. The method can easily be 
generalised to the case where the noise is stationary but correlated 
between pixels, but we do not consider that generalisation here. 

The prior distribution of bulge fraction B for the disk galaxy 
population was assumed to be a truncated gaussian distribution, 

V{B) = Vo exp [- [B - Bof / (2aB^)] < B < 1 (4) 

where Vo was a normalising factor and Bq and ctb were param- 
eters of the prior distribution. The gaussian form of the prior was 
a convenient choice to allow easy marginalisation over B, given 
the quadratic dependence of £ on B. We adopted Bo = 0, ctb = 
0.1, /s = 0.1 as a reasonable representation of th e distribution 
of bu l ge fraction found a t low and high redshift (e.g. lSchade et al.l 
Il996lisimardetalj|2002h . We marginalised over the gaussian part 
of the prior by collecting terms in equation[3] and multiplying by 
the prior, writing the log of the posterior probability density for the 
disk population, logPdisk, as 

-21ogPdiBk = EyV'^' +252^2 + 2Bro 

+ s^E/V'^'-asE/y/'^' + BoK (5) 



where 
261^ 



5(E/y/'^'- 



-EsV-'- 

Effy/'^') 

-E/V-') 



Bo/4, 



and where the summations are understood to be over the pixels i. 
We marginalised over B to obtain the log of the marginalised disk 
posterior probability density as 

-2 logPdisk = S^j:f/a^ - 2Sj:fy/a^ - ^ 

+ 21oge-21og[erf -erf (g)] , (6) 

neglecting any normalisation terms that do not depend on a galaxy's 
model parameters. 

The bulge-fraction-marginalised total probability P = (1 — 
/B)Pdisk + /sPbuigG then needed to be marginalised over both 
flux and position. To marginalise over galaxy position at a given 
model galaxy flux, we followed an improved version of the method 
described in Paper I. For stationary noise, the terms E f '^ /'^'^ ^rid 
E /<^^ are invariant under shifts of the model position, and fol- 
lowing Paper I, the terms E fv/'^'^ and ^gy/cy^ may be regarded 
as the spatial cross-correlation of each model component with the 



data, so we may write 



(7) 



where the cross-correlations h / or hg depend on the vector posi- 
tion shift X of the model. As pointed out in Paper I, to marginalise 
over X we need to adopt a prior V{x\. a priori, the position of the 
galaxy is unknown, and is equally likely to exist at any position on 
the sky, so we adopt a uniform prior. Any other choice would lead 
to a bias in the position of the galaxy, and hence a bias in the mea- 
sured ellipticity, arising from the correlation between fitted ellip- 
ticity and position. However, as discussed in Paper I, £ — ^constant 
as I a; I oo and the marginalised likelihood would not be finite. 
This problem arises because, no matter how large a pixel value, it 
always has a finite chance of being due to random noise, with the 
true galaxy being positioned elsewhere. At large position offsets. 



X — >■ oo, hf,hg — ; 
towards a 'pedestal' 
where 

- 2 log Po,disk = 



and the posterior probability density tends 

value Po = (1 - /B)Po,disk + /sPo.bulge, 



2 log erf I e 



292 

261 



2 log £» 

-erfri 
29 



ro = 



'21ogPo, 



bulge — 



{j:.f9/a'-j:f/^')-Bo/ai, 



Thus we need to identify a maximum bound for the position offset 
over which to marginalise. 

As argued in Paper I, galaxies generally have smooth 
centrally-concentrated surface brightness distributions which are 
convolved with centrally-concentrated PSFs in an observed image 
(the assumption of a smooth convolved galaxy profile is a good ap- 
proximation for galaxies that are not well resolved). The model is 
also smooth, centrally concentrated and convolved with the same 
PSF. As in the derivation of the central limit theorem, such a cross- 
correlation should be well represented by a two-dimensional gaus- 
sian distribution, and hence we approximated log P as being a 2D 
gaussian added to the pedestal value logPo. To marginalise over 
the galaxy's position, we adopted a more exact integration of this 
form than assumed in Paper I, and integrated out to some maxi- 
mum position uncertainty, rmax. To establish the value of rmax, 
we considered that the only information that we have about the 
galaxy is on the images being analysed: the galaxy probably was 
initially detected, and its position measured, with some uncer- 
tainty, from the same data being used to measure the shape. We 
are seeking to use the data in this region of the image to measure 
the shape of a galaxy that exists at this location, not of any arbi- 
trary galaxy anywhere else in the image: although the likelihood 
is non-zero at large position offsets, such a galaxy would not cor- 
respond to the one which had been detected. Accordingly we set 
^max to be the position beyond which the detection of this galaxy 
becomes statistically insignificant, which we define to be given 
by AlogP — P(r-max) — Po < Axcrit/2 and where we chose 
^Xcrit ~ 6, corresponding to the 95 percent confidence region for 
the location of the galaxy. The integral is given by an exponential 
integral which was evaluated numerically using the GNU Scientific 
Library taalassi et ai]|2009l) , rather than using the approximation 
of Paper I. 

In Paper I, the flux marginalisation was carried out analytically 
by noting the gaussian dependence of P on S. For the CFHTLenS 
analysis, it was decided to adopt a prior distribution for S which 
was given by a power-law in S: 'P{S) oc 5*"" where a = 2 
was chosen to match the flatter-than-Euclidean slope of the faint 
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i-band galaxy number counts (e.g. iGabasch et al]|2008h . The re- 
sulting posterior probability distribution cannot then be analytically 
marginalised over 5, so this was achieved numerically, by evaluat- 
ing P at a set of points close to the expected maximum likelihood, 
fitting a cubic spline function to the variation of log P with S and 
numerically integrating the spline-fitted values of P. 

After these marginalisation steps, the marginalised likelihood 
had been reduced to being a function of ellipticity and galaxy scale- 
length alone. The final step was to adopt a prior for the scale- 
length distribution (described in AppendixiBt and to numerically 
marginalise over this parameter As for the flux marginalisation, 
log P was measured for a set of values of scalelength r, a cubic 
spline fitted to those values, and P was numerically integrated with 
respect to r. The set of r values measured included a point source, 
r = Q, and it was ensured that the set of log P values to be fit- 
ted varied smoothly towards r = 0. To distinguish between stars 
and galaxies, objects were classified using an F-ratio test of the ra- 
tio of the chi-squared values of the best-fit galaxy (r > 0) and 
the best-fit star (r = 0) models, taking into account the differ- 
ing numbers of degrees of freedom of these two fits. A star clas- 
sification was adopted if the probability of the F-ratio exceeded a 
magnitude-dependent limit that was matched to visual classifica- 
tion of objecto Of the objects that were fitted, typically 95% were 
classified as galaxies, as expected at faint magnitudes. 

3.4 Sampling the likelihood surface 

In order to obtain unbiased estimates of shear, and to measure the 
statistical uncertainty in ellipticity for each galaxy, we need to not 
only find the model ellipticity whose marginalised probability is 
a maximum, but also we need to sample around the maximum to 
measure the likelihood surface, at least in regions where the proba- 
bility is not insignificant. The ellipticity surface is two-dimensional, 
and bounded by the requirement jej < 1, so it is relatively straight- 
forward to sample the surface appropriately. 

However, galaxies of low signal-to-noise ratio or small size are 
expected to have broad likelihood surfaces, whereas high signal-to- 
noise galaxies should have sharp, well-defined surfaces. Thus the 
sampling needed depends on the properties of each galaxy. The 
algorithm adopted here was an adaptive grid sampling, in which 
ellipticity measurements were first made on a coarse grid of ellip- 
ticity, at intervals of 0.16 in ellipticity. At each sampled value, a 
full marginalisation over the other parameters was made. Then, the 
algorithm counted the number of sampled points above a thresh- 
old of 5 percent of the maximum posterior probability, and if that 
number was below a minimum value of 30 points, the sampling 
in ellipticity was reduced by a factor 2, and the region around the 
maximum, only, sampled at the higher resolution, until either 30 
points had been measured above the threshold, or until a resolution 
of 0.02 in ellipticity was reached (for the faint galaxies that dom- 
inate the weak lensing signal this limiting resolution in ellipticity 
was adequate, and is much smaller than the 'shape noise' caused 
by the broad distribution of intrinsic ellipticity). 

In a later improvement, two passes were made through the el- 
lipticity surface, the first to find only the maximum using the adap- 



^ A bayesian procedure that took account of the relative probabilities of 
stars and galaxies as a function of magnitude could avoid needing to specify 
the value of the F-ratio statistic, however given the veiy small numbers of 
stars expected in the faint galaxy, weak lensing, regime, i > 23. this was 
not deemed to be an important improvement in this survey. 



tive sampling algorithm, the second to map out the surrounding 
region. This algorithm was found to be more efficient (in that the 
fraction of sampled points that were retained above the threshold 
was higher) but was not implemented in time for the CFHTLenS 
analysis reported here. Monte-Carlo Markov Chain (MCMC) meth- 
ods could also be used, but given the low dimensionality and finite 
bounds of the parameter space, are unlikely to be significantly more 
efficient than this adaptive sampling algorithm. 

3.5 Shear estimation and noise bias 

Papers I & II discussed at length the advantages of bayesian esti- 
mation, but also noted that a bias would arise in shear estimates 
if the likelihood distribution for each galaxy were multiplied by 
an ellipticity prior before the estimation process. The existence of 
that bias demonstrates that a bayesian method of measuring the el- 
lipticity of an individual galaxy is not in itself a complete method 
for measuring shear in a sample of galaxies. A solution proposed 
at that time was to calculate the 'sensitivity', which measured the 
biasing effect on a shear estimator of applying an ellipticity prior. 
In Paper I, the sensitivity was defined as s = 9 (fia) /dga, a cali- 
bration of how the mean ellipticity of a sample {ca), for ellipticity 
component a, responds to a cosmological reduced shear Qc, in that 
component. Cross-terms were ignored. Analogous approaches to 
the sensit ivity, albeit not in this ba yesi an context, hav e been dis- 
cussed bv lBemstein & JarvisI ( |2002|) and lKaisej ( |2000|) . 

An immediate difficulty with this approach is that the sensi- 
tivity should be a rank 2 tensor, but in fact even if only the diagonal 
terms are considered, the sensitivity is not isotropic, and application 
of sensitivity as a correction can lead to an orientation bias in the 
measured shear As the sensitivity is calculated from the measured 
likelihood surface, and as those surfaces are biased (AppendixICll. 
poor results were obtained in CFHTLenS when this approach was 
used. 

An improved alternative, used for CFHTLenS, is a likelihood- 
based estimator of ellipticity, in which all nuisance parameters were 
marginalised over using a fully bayesian approach as previously 
described, but where the ellipticity estimate is based solely on the 
likelihood distribution rather than a posterior probability distribu- 
tion. It was argued in Paper I that such an estimator should be an 
unbiased estimator of shear, even though the statistical distribution 
of galaxy ellipticities is broadened by the measurement noise: i.e. 
the likelihood estimator has no correction for noise, but should re- 
spond linearly to a cosmological shear, when averaged over many 
galaxies (when ellipticity is defined as in Section [T2t . However, 
the non-linearity of the transformation from a pixel basis to ellip- 
ticity does result in a 'noise bia s' in shear estimates (Refreg ier et al.l 
l20I2l : lMelchior & Violj20l3) which is discussed below. 

We should also consider which estimator to use for the ellip- 
ticity value of each galaxy. If we consider the (simplified) shear 
measurement process of averaging galaxy ellipticities, where each 
galaxy has its own PDF of ellipticity, the PDF of the mean is given 
by the convolution of the individual PDFs, suitably rescaled. Ac- 
cording to the central limit theorem, the PDF of the mean should be 
gaussian, with expectation value given by the mean of the individ- 
ual galaxy PDF means. Thus to make a fully probabilistic estimator 
of shear, we should use the mean ellipticity as our estimate for each 
galaxy, not the maximum likelihood estimate. Unfortunately, such 
an estimator is biased, as discussed in AppendixICl For the case 
where the galaxy size is known precisely, the maximum likelihood 
ellipticity should be unbiased and may be a better estimator to use, 
but this is not true if the galaxy's size is a free parameter: there 
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Figure 2. An example of the distribution of lensing weights for 10,000 measured galaxies in one typical CFHTLenS field, Wlm2m3. Galaxies were measured 
on seven exposures to a limiting magnitude i' < 24.7. Weights are shown as a function of (left) apparent magnitude i, {centre) fitted semi-major axis disk 
scalelength r and {right) signal-to-noise ratio !^sN- 



is an inevitable degeneracy in the fits, between size and elliptic- 
ity, which caus es a furth er bias in the Ukehho od surfaces (see also 
iRefregieret ail 2012: Mel chior & Violj2012h . Unlike the case of a 
pure likelihood likelihood estimator, bayesian marginalisation over 
the size parameter mitigates this effect, because small galaxy sizes 
are downweighted by the size prior, but it does not eliminate it. A 
fully bayesian estimate of ellipticity should control the noise bias, 
but currently we do not have a satisfactory method of correcting for 
the bias on the shear introduced by the application of an ellipticity 
prior discussed as the start of this subsection. 

To partially address the problem of not being able to use a 
bayesian estimate for the ellipticity, we downweighted likelihood 
values at ellipticities where the bayesian posterior probability was 
low. We formed a posterior probability surface by multiplying by 
the ellipticity prior, and rejected any posterior probability values 
less than some threshold. The threshold was set to 1 percent of the 
maximum posterior probability. Applying our prior knowledge of 
the ellipticity distribution in this way allowed consistent rejection 
of the 'small, highly elliptical' tail on the likelihood surface de- 
scribed above. The galaxy's ellipticity was then estimated from 
the mean of the likelihood distribution of the ellipticity samples 
that were above the posterior threshold. This somewhat ad hoc ap- 
proach is not a substitute for a fully bayesian approach, but it did 
reduce the residual systematic signals discussed in Section|7] We 
still expect bias in the estimator at low signal-to-noise ratio, al- 
though the lowest signal-to-noise galaxies also had low weights in 
the analysis. An empirical correction for the noise bias was derived 
from the simulations described in Section[8] One might hope that, 
in future, a better ellipticity estimator could be constructed from 
the likelihood or bayesian posterior probability surfaces which is 
immune to noise bias and which leads to an unbiased estimate of 
shear. To date, we have not achieved such an estimator, but have 
found an approach that yields satisfactory results for CFHTLenS. 



3.6 Galaxy weighting 

Faint galaxies have more noisy ellipticity measures than bright 
galaxies, or, equivalently, have broader likelihood surfaces. Given 
that the full likelihood surfaces for the galaxies' ellipticities are not 
used in making cosmological shear estimates, each galaxy should 
be weighted according to the width of its likelihood surface in the 
cosmological analysis. We calculated an inverse-variance weight, 
defined as 



where is the ID variance in ellipticity of the likelihood sur- 
face, o-pop is the ID variance of the distribution of ellipticity of the 
galaxy population and Cmax is the maximum allowed ellipticity, as- 
sumed here to be the maximum disk ellipticity as in equation lB2l 
In the limit where emax — > oo, this definition of the weight tends 
towards a conventional form, w — >■ (ctc + o"pop)^^- A finite value 
of Emax was lucludcd to ensure that a galaxy with a likelihood sur- 
face which is uniform for e < fimax, and which therefore conveys 
no information about the galaxy shape, has zero weight. For a gaus- 
sian likelihood surface and prior (and in the limit emax — >■ oo), w 
has the same value as the sensitivity (Section [331 >, except for its 
normalisation (Paper I, Section 2.5). The expressions differ in the 
(actual) case of non-gaussian probability surfaces, but w has the 
advantage that it is not by definition anisotropic. In the limit of low- 
noise measurements, the weight becomes dominated by the 'shape 
noi se' of the galaxy populatio n, as expected for weak-lensing stud- 
ies (Bernstein & Jarvisll2002l) . Fig.[2]shows an example of the dis- 
tribution of weights in one of the CFHTLenS fields, as a function of 
i'-band magnitude, fitted galaxy major-axis scalelength, and signal- 
to-noise ratio, z/sN. The weight falls off sharply for i^sn ;$ 20. 
The maximum value of the weight is determined by the value of 
Cpop = 0.255, obtained from the ellipticity prior described in Ap- 
pendixlBl 

However, although the weight is designed to be isotropic, it is 
possible for there to exist a form of ' orient ation bias' similar to the 
selection bias discussed by I Kaiser! j2000l) and iBemstein & Jarvi^ 
i2002). In the case of the selection bias, we are concerned about 
galaxies being omitted from the sample if they are cross-aligned 
with the PSF. However, even for detected galaxies, the weight also 
may be larger for galaxies that are aligned with the PSF than for 
those that are cross-aligned, because the likelihood surface may be 
more sharply peaked. We did not make any attempt to correct ex- 
plicitly for this effect, but it may manifest itself as a correlation 
between weighted galaxy shapes and the PSF, as shown in Sec- 
tionlsn 



2cr2 



(8) 



3.7 The treatment of large, blended or complex galaxies 

The galaxy models were fitted to each galaxy in subregions ex- 
tracted around each galaxy ('postage stamps') of size 48 pixels 
(approximately 9"), this choice being a compromise between hav- 
ing a large enough region to correctly model the galaxies but small 
enough that the compute time for operations such as the fast Fourier 
transform was not prohibitively slow. If each galaxy had n useful 
exposures, then n postage stamps were created and models jointly 
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fitted, allowing for astrometric offsets and camera distortion as de- 
scribed in Sections|4]&|6]below. Inevitably, some galaxies had sizes 
too large to be fitted in this size of postage stamp; such galaxies 
were excluded from the analysis. 

In some cases, two or more neighbouring galaxies appeared 
within the same postage stamp. The algorithm can only fit one 
galaxy at a time, so the solution adopted was to first see whether 
it was possible to mask out one galaxy (set its pixel values equal 
to the background) without disturbing the isophotes of the galaxy 
being fitted. To this end, a co-added image postage stamp was cre- 
ated, averaging all the exposures available for that galaxy, shifted 
so the relative positions agreed to the nearest pixel, which was 
then smoothed by a gaussian of FWHM equal to that of the lo- 
cal PSF. Isophotes were created for each smoothed galaxy: if a 
separate galaxy or other object was identified with non-touching 
isophotes, at a level of twice the smoothed pixel noise, that other 
galaxy was masked out and the fitting would proceed. Such close 
pairs of galaxies are thus included in the output catalogues from 
CFHTLenS. We note, however, that low-level light leaking below 
the two-sigma isophote could still contaminate the measurement, 
and thus we expect the ellipticity measurements of galaxies in close 
pairs, whose isophotes may be contaminated by their neighbour, to 
be artificially correlated. 

Within each postage stamp, it may be that some pixels should 
be masked because of image defects. The THELI pipeline provided 
images of pixel masks to be applied. If such masked pixels occurred 
within the two-sigma isophote of a galaxy on one individual expo- 
sure, that exposure was not used in the joint analysis. If such pixels 
occurred outside the two-sigma isophote, the pixel values were set 
equal to the background and that masked exposure was used in the 
joint fitting. 

Other galaxies may be sufficiently close that their smoothed 
isophotes overlapped, and there may also be individual galaxies 
with complex morphology, not well described by a simple bulge- 
plus-disk model. These galaxies were identified using a deblend- 
ing algorithm, testing for the presence of significant independent 
maxima in the smoothed surface brightness distributioi{f|. Any such 
complex or blended galaxies that were found were excluded from 
the analysis. A further criterion was imposed, that the intensity- 
weighted centroid of a galaxy, measured from the pixels within the 
smoothed 2a isophote, should lie within 4 pixels of the nominal 
target position: this criterion guarded against any blended galaxies 
that had been identified as blends in the original input catalogue 
but that had not been identified by the other tests described in this 
section. Some examples of images of galaxies excluded by these 
criteria are shown in Fig.[3l which shows examples of the stacked, 
smoothed images used for testing for object complexity. Visual in- 
spection indicated that the great majority of galaxies excluded in 
this way had isophotes that overlapped with neighbouring galaxies. 

The fraction of galaxies that were excluded in this way varied 
somewhat between fields, as the criteria were affected by the size 
of the PSF. Typically, 20% of galaxies were excluded. Although 

^ The algorithm was similar to that of lBeard et alj il990l) . Maxima in the 
smoothed surface brightness distribution associated with the tai'get galaxy 
were identified, and regions 'grown' around those maxima by successively 
lowering a threshold isophote level from that maximum level. Pixels above 
the threshold were either identified with the corresponding maximum of any 
identified pixels that they touched, or otherwise were defined to be a new, 
secondaiy, maximum. Regions with fewer than 8 pixels were amalgamated 
into any touching neighbours. If multiple regions remained after this pro- 
cess, within the limiting 2cf isophote, the galaxy was flagged as 'complex' . 





Figure 3. Examples of four galaxies excluded from measurement by the 
criteria described in Section lXTl in field WlmOml. Each panel shows a 
coadded image 48 pixels (approximately 9") square, centred on each tai'get 
galaxy, and the inverted grey scale is linear up to some maximum value 
which varies between images. 

this fraction seems high, such a loss of galaxy numbers does not 
significantly degrade the signal-to-noise of the final cosmological 
analysis, but it does help ensure that galaxies whose measurements 
would be poor because of their size, or because they would be 
poorly modelled, have been excluded. These exclusion criteria are 
likely to introduce small-scale selection effects into the galaxy dis- 
tribution (e.g. neighbouring galaxies would have been classed as 
being blended with greater or lesser probability depending on how 
they were aligned with respect to the PSF) and so lensing signals 
on arcsec scales, < 5", should be excluded from analyses of this 
survey, even though nominal measurements are reported in the out- 
put catalogues. We note that the exclusion of some fraction of close 
pairs of galaxies may intr oduce a bias at a leve l of a few percent into 
cosmological parameters ( lHartlaDetall201ll) : we do not currently 
have any way to estimate the size of this bias in an actual survey 
such as CFHTLenS, without a detailed model of the true distribu- 
tion of galaxy pairs and of the effect of the measurement process 
on those pairs. 



4 OPTIMAL COMBINATION OF MULTIPLE IMAGES 

The algorithm pr esented in Papers I & II, and also t he simulations 
of the GREAT08 jSridle et alj201oh and GREAT 10 jKitching et all 
|20I2|) challenges, assume that each galaxy is measured on a single 
image. However, actual galaxy surveys use combinations of multi- 
ple exposures in the same waveband, or even across different filters. 
The reasons for having multiple exposures in the same filter are: (i) 
to increase the dynamic range of the observations; (ii) to prevent 
an excessive build-up of cosmic ray artifacts on any one image; 
(iii) to allow dithering of observations, filling in gaps where CCD 
boundaries or CCD artifacts prevent useful data being obtained and 
mitigating the effects of the finite pixel sampling. Thus any shear 
measurement method should make optimal use of such multiple 
images. In CFHTLenS typically seven dithered exposures were ob- 
tained in each field (Section|2j. 
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It is common practice for multiple images to be averaged to- 
gether to make a single image on which shear measurement is per- 
formed. This is problematic for high-precision galaxy shape mea- 
surement, for several reasons. 

(i) In order to make the averaged image, the individual images 
need to be first interpolated onto a common pixel grid. The inter- 
polation causes distortion of the PSF, but because of the slightly 
differing local plate scale between exposures, the interpolation ker- 
nel varies across the image on scales typically less than arcminutes, 
leading to patterns analogous to Moire fringes and introducing arti- 
ficial PSF variations on arcminute scales. These are difficult to cal- 
ibrate from star measurements owing to the finite density of stars 
in typical observations. Knowing the interpolating kernel, the ef- 
fect of the interpolation could in principle be corrected for at the 
location of each galaxy, but this is not a straightforward process. 

(ii) The interpolation introduces correlation between pixels into 
the noise, which is also problematic to correctly allow for in shear 
measurement methods. That noise correlation also varies on ar- 
cminute scales. 

(iii) Usually the PSF differs between each observation, so that 
averaging does not represent an optimal combination of images. 

(iv) The co-addition of PSFs of differing shapes and orientations 
may lead to a complex stacked PSF which might be more problem- 
atic to measure and model, depending on the method adopted. 

A likelihood-based method provides a straight-forward way of 
overcoming all four issues. The galaxies on each individual im- 
age can be fitted by models that have been convolved with the cor- 
rect PSF for that image, and the likelihoods simply multiplied to 
obtain the final combined likelihood. Images of poor quality, per- 
haps with a worse PSF, have a broader distribution of likelihood 
values than those of better quality, and hence the combined like- 
lihood represents an optimal combination of information. Because 
models are fitted to each individual image, there is no need for in- 
terpolation, avoiding the first two problems listed above. Tests of 
the lensfit algorithm on stacked images, not taking into account 
the above effects, produced significantly worse results, with large 
cross-correlation signals between galaxies and the PSF, than were 
obtained from the joint analysis of individual exposures presented 
here. 

In this method, the likelihoods are calculated on each image 
as a function of all the model parameters, including the unknown 
position of a galaxy. Optimum results are obtained by assuming 
that the galaxy's position should be the same, in celestial coordi- 
nates, on each image - provided the relative registration of the im- 
ages is known, the positions on each exposure should be the same 
and should not be treated as independent model parameters. This 
approach requires highly accurate knowledge of the relative astro- 
metric registration of the images, however, otherwise an artificial 
shear would be introduced which would be highly correlated be- 
tween neighbouring galaxies. A similar problem arises when av- 
eraging images into a single stacked image: astrometric errors de- 
grade the PSF. However, in that case, at least the PSF then is mea- 
sured from the stacked image, so the degraded PSF is, in principle, 
still correctly measured (unless astrometric errors vary on small 
length-scales across the image). In the case of the likelihood-based 
method, we do not have this luxury, because the PSFs are mea- 
sured on individual frames. Thus, it is essential to have an accurate 
measurement of the relative image registrations. Those registration 
shifts can then be applied to each of the models fitted to the individ- 
ual images. This process must be accurate to significantly less than 
one pixel, since astrometric errors are correlated over large scales 




Figure 4. Example PSF models derived in a typical CFHTLenS exposure: 
ID 720093 in field Wlp3mO. The upper panels show the model PSF sam- 
pled at three locations across the MegaCam field, with a linear greyscale 
varying from to 0.05, where the values are relative to the total flux in the 
PSF. Each panel is 32 pixels (approximately 6") square. The centre panels 
show the mean residuals, averaged over all the stars selected on that CCD, 
between the PSF model and the star images, with a linear greyscale vary- 
ing over the range ±5 X 10~*. No residuals are larger than 7 X 10~^ 
anywhere in these panels. The lower panels show the rms variation in the 
model, measured as described in the text, with a linear greyscale over the 
range (black) to 5 X 10~* (white), with maximum rms 5.6 X 10~^. The 
PSFs and the residuals are shown: (left) at the centre of the CCD in the high- 
est right ascension, lowest declination part of the MegaCam field; (centre) 
at the centre of the CCD immediately below the centre of the MegaCam 
field; (right) at the centre of the CCD in the lowest right ascension, highest 
declination part of the MegaCam field; 



across an image, and if uncorrected, would introduce an apparent 
shear. 

The way these relative astrometric registrations were mea- 
sured for CFHTLenS was, for each image, to cross-correlate the 
local PSF model with each star, thus measuring, from the maximum 
of the cross-correlation, the apparent location of each star with re- 
spect to its nominal celestial coordinate with maximum precision. 
A low-order polynomial function was then fitted to those registra- 
tion shifts across each CCD, and the shift function applied to the 
positions of the PSF-convolved galaxy models when fitting each 
exposure of each galaxy. A further sub-pixel correction was applied 
to allow for the fact that an individual galaxy's postage stamp was 
extracted from each exposure based on the nominal celestial coor- 
dinate (using the same coordinate transformation as for the stars 
used to create the PSF model), but was centred at an integer pixel 
value (i.e. no interpolation of data was done when extracting each 
postage stamp). The final shifts applied were always sub-pixel, and 
usually were less than one-tenth of a pixel. 

Similarly to the issue of celestial position of a galaxy, opti- 
mum joint analysis of multiple exposures should also assume that a 
galaxy has an invariant, but unknown, true flux, and that the relative 
measured flux on each exposure is related to the true flux through 
a known calibration relation. As the photometric calibration was 
carried out over many bright stars, we assumed the uncertainty in 
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the photometric calibration was negligible compared with the pho- 
tometric uncertainty associated with each image of an individual 
galaxy. Thus a galaxy's flux was treated as a single unknown pa- 
rameter to be marginalised over, using exposures that had the pho- 
tometric calibration applied to them. 

In principle, the same methodology could include exposures 
obtained in other filters, provided the filters are sufficiently close 
in wavelength that the same model provides a good description of 
each galaxy's structure at all the observed wavelengths. In that case, 
the galaxy's flux and bulge fraction values in each filter should 
be added as additional free parameters to be marginalised over. 
In CFHTLenS, only the deep, good-seeing i'-band exposures were 
used. 



5 POINT SPREAD FUNCTION ESTIMATION 

As in other lensing methods, we measured the PSF from images of 
stars on the same exposures as the galaxies being measured. The 
PSF was represented as a set of pixels at the same resolution as 
the data, and in the model-generation process, galaxy models were 
convolved with that PSF model. The PSF varied over the field of 
view: in MegaCam it varies both over the full optical field and also 
between the individual CCDs that comprise the full camera, and 
discontinuities in PSF are detectable between CCDs. 

To generate a pixel representation of the PSF, an initial cata- 
logue of candidate stars was selected from a stacked image of all 
exposures. Stars were selected using a conventional cut in the ap- 
parent size-magnitude plane, identifying by eye the locus of stars 
in the size-magnitude plane and selecting a box around that locus. 
The selection was further refined by applying a three-waveband gri 
selection. First, a stellar locus was defined in colour space by mea- 
suring the colour-space density of the input catalogue and rejecting 
any object which lay in a low-density region. Images of a subset of 
the resulting catalogues were inspected by eye and found to com- 
prise clean samples of stars (note that this procedure was relevant 
only for the generation of bright star catalogues for PSF creation 
and was not used for the classification of faint objects in the lens- 
ing measurement). Mo re details of the generation of the PSF star 
catalogues are given bv lHevmans et alj i 2012h . 

A pixelised PSF model was then created for each exposure 
by first centering each star according to a measurement of its cen- 
troid, then fitting each recentred pixel value as a two-dimensional 
polynomial function of position in the telescope field of view (the 
centering process is discussed below). For the interpolation in 
CFHTLenS, a third-order polynomial was found to produce good 
results on a random subset of the survey, with no improvement 
found using fourth-order, where the test used was to measure the 
residual cross-correlation between the PSF model and galaxy el- 
lipticities (Section|7}. To allow for the discontinuities in the PSF 
across the boundaries between CCDs, some coefficients of the 
polynomial could be allowed to vary between CCDs, with the re- 
maining coefficients being fixed across the field of view. Good re- 
sults were found allowing the constant and linear terms to vary in 
this way. 

The polynomial fits were constrained by the images of the 
stars in the input catalogue, where only stars with integrated signal- 
to-noise ratio larger than 20 were used, and where each star was 
weighted by the function Ws = J^sN,star/(2^SN,star + 50^), where 
J^SN.star was the star's integrated signal-to-noise ratio on that ex- 
posure and where the extra term in the denominator prevents any 
one very bright star from dominating the polynomial fit. Any star 



whose central pixel was more than 50% of the full-well limit for the 
CCD was not used, and any star which had either pixels that were 
flagged as not usable or pixels that appeared to be part of another 
object within a radius of eight pixels of the star centroid were also 
excluded. Pixels were deemed not usable either if they appeared to 
have been affected by a cosmic ray impact, if they were identified as 
being defective in a 'bad pixel' mask, or if they appeared to be af- 
fected by charge transfer from saturated stars, in the THELI pipeline 
(Section|2]l. Any individual CCD images that had fewer than 40 us- 
able stars were not included in the analysis, as it was found that 
smaller numbers of stars could produce PSF models that were not 
robust to a test of varying the stars selected. 

Star centroids were measured iteratively by cross-correlating 
with the PSF model and finding the centroid location. Star images 
were then interpolated so as to be centred on the pixel grid, us- 
ing sinc-function interpolation, and thus assuming the PSF to be 
a Fourier band-limited function (i.e. assuming there are no modes 
above the Nyquist limit of the data). As this assumption is unlikely 
to be valid, this is one of the main areas to be addressed in future 
improvements of the algorithm (see Section|9]l. 

The top panels of Fig.|4] show an example of the PSF pixel 
model created in one of the CFHTLenS exposures, sampled at three 
locations across the MegaCam field (two of the comers and near 
the centre). For comparison, the centre panels of the figure also 
show the mean difference between the model and the stars, aver- 
aged over all the stars on each CCD (80, 71 and 105 stars in each 
of the three CCDs shown). The lower panels show the rms variation 
in the model, measured by creating 64 bootstrap resamplings of the 
input star catalogue, recalculating the entire PSF model for each 
resampling, and measuring the rms variation in those mean resid- 
uals between the bootstrap-generated models and the stars from 
the original catalogue. The PSF model was fitted globally to all 
36 MegaCam CCDs simultaneously, and the values shown have 
been evaluated by averaging over all stars on one CCD. The images 
have been normalised so that the sum of values in the PSF is unity. 
Within the central region of the PSF the typical rms is~ 3 ^ 10~*. 
This test shows the reproducibility of the PSF model generation on 
the scale of each CCD, 6.3' x 14.3'. 

Fig.[5] shows an example of the PSF variation across one of 
the CFHTLenS fields. The quantities plotted are the PSF ellipticity 
(orientation and amplitude) and the fraction of light in the central 
pixel. These quantities have been averaged over 7 exposures. As 
well as a large-scale radial distortion, jumps in PSF are visible at 
CCD boundaries, especially the horizontal boundaries where there 
are two larger gaps between detectors, and where those boundaries 
appear ill-defined on the figure because of dithering of the 7 expo- 
sures. 



6 DISTORTION CORRECTION 

Wide-field cameras introduce distortion into the field-of-view 
which can mimic the cosmological shear signal, and this there- 
fore needs to be corrected for. Because field distortion may vary 
with telescope position and temperature, and because exposures 
are dithered, it should ideally be mapped and corrected on each 
individual exposure. The distortion information is implicitly avail- 
able in the astrometric calibration of the survey: provided the as- 
trometric solution has good absolute accuracy, the distortion can 
be measured simply by measuring the relationship between pixel 
and celestial coordinates as a function of position across the field. 
The centroid positions of stars, on which the astrometric solution 
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Figure 5. The variation of measured PSF over one of tlie CFHTLenS fields witli a liiglily variable PSF, Wlmlp2. The plot has been made by combining 
information from seven exposures, which are somewhat offset with respect to each other. PSF ellipticities have been simply averaged, (left) stick plot showing 
the variation of measured PSF ellipticity across the field. The maximum stick length on the plot corresponds to ellipticity of 0.13. (centre) grey scale showing 
the ellipticity amplitude, with black having an ellipticity of 0.16. (right) greyscale showing the fraction of the light in the central pixel of the PSF, with black 
indicating a fraction 0. 1 . 



is based, can generally be measured to subpixel accuracy, whereas 
the scalelengths of distortion variations are typically arcminutes or 
greater, so that the typical accuracy in distortion correction by this 
method is shear error of order 10~^ or better. 

Having measured the distortion on an image, this information 
was incorporated into the galaxy models by calculating the local 
linearised distortion transformation and applying that as an affine 
transformation to the models being fitted to each galaxy on each 
individual exposure. In other words, the effect of distortion was 
corrected by including it in the forward modelling of each galaxy 
on each exposure, rather than by trying to remove it from the data. 
Because the PSF models had not had distortion removed from them 
(the stars were not distortion-corrected on input to the models), the 
distortion was applied to the galaxy models prior to PSF convolu- 
tion, to ensure that the distortion and convolution corrections to 
the models correctly mirrored these effects in the data. In prac- 
tice, the models were fitted in Fourier space, so the affine trans- 
formation was also applied in Fourier space, using a simple ma- 
trix operation to convert the affine tr ansformation between real and 
Fourier space teracewell etaljl993l) . The affine transformation al- 
lowed for scale, rotation and shear distortion, and ensured that the 
ellipticities of all galaxies were measured with respect to the celes- 
tial coordinate system, and not any arbitrary pixel grid (i.e. galaxy 
position angle was defined with respect to the local north-south, 
east-west axes at each galaxy). When calculating shear correlation 
functions or power spectra from these data on large scales, it may 
be necessary to correct for the translation of the celestial coordinate 
system around the celestial sphere (i.e. there needs to be a 'course 
angle' correction to the differences in position angles of pairs of 
galaxies separated on large angular scales, when not working on 
the celestial equator). 



7 TESTS FOR SYSTEMATIC ERRORS 

We postp one detailed analyses for systematic biases to a compan- 
ion paper (Heymans et alj20"l^ . however in this section we discuss 
the chief effects which may bias the shear measurement results and 



illustrate the tests that we have carried out, showing specific exam- 
ples from individual CFHTLenS fields. 

There are a number of possible effects which could lead to the 
creation of systematic biases in the measured shear. 

(i) Inaccurate PSF modelling. 

(ii) Use of galaxy models and prior distributions for the model 
parameters that are not representative of the true population. 

(iii) Bias in the shear estimator (such as that described in Ap- 
pendixlct. 

(iv) Inaccurate relative astrometry. 

PSF modelling errors could be investigated by comparing the mod- 
els with the stars used to generate the models (i.e. a quantitative 
assessment such as shown in Fig.|4li: however, it is difficult to as- 
sess how large a deviation between model and reality can be tol- 
erated, what is really needed is an assessment on the effect of 
the final measured galaxy shear. S u ch an analys is has been pre- 
sented by IPaulin-Henriksson et a D ( l2008l I2OO9I) , giving a good 
qualitative picture of the effects expected from PSF modelling 
errors. It is difficult to make a quantitative prediction of the ef- 
fect on shear without repeating the analysis with realistic galaxy 
sizes and shapes, PSF surface brightness distributions that match 
the CFHTLe nS PSFs and likely PSF mode lling errors. However, 
the results of IPaulin-Henriksson et al.l 1 I2OO8 ) show that we expect 
modelling errors to result in significant cross-correlation between 
the PSF model and the galaxy ellipticities, and furthermore that 
the cross-correlation is expected to be a strong function of galaxy 
size. The use of incorrect galaxy models or prior distributions is 
likewise expected to produce size-dependent PSF-galaxy cross- 
correlation. Bias in the shear estimator is expected to produce PSF- 
galaxy cross-correlation with a strong signal-to-noise dependence 
(Fig.|Cl. 

To test for these effects, we first measured, in each individual 
CFHTLenS field, the cross-correlation between galaxy ellipticity 
and the mean ellipticity of the PSF model, measured at the loca- 
tion of each galaxy, with PSF ellipticity averaged over the expo- 
sures used. The cross-correlation is shown for three typical fields, 
as a function of signal-to-noise ratio j^sn and fitted galaxy size, 
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in Fig.|6] Statistical uncertainties were calculated by randomly di- 
viding the observed samples into ten subsamples, and measuring 
the variance in the statistic between the subsamples. The reported 
error is then vTO smaller than the subsample rms. However, on 
scales of one degree, there is a significant additional term that 
arises from the rand om cross-correlation o f the cosmic shear sig- 
nal with the PSF. In iHevmans et alj ( 1201 2 ) this term is explicitly 
estimated using shear derive d from the dark matter simulations of 
iHarnois-Deraps et alJ ( |2012|) . In this paper, this additional term is 
ignored. It manifests itself as an uncertainty, which is highly corre- 
lated between the plotted points, to be added in quadrature to the 
statistical uncertainties shown in Fig.|6] 

The test presented may not be sensitive to all conceivable 
forms of PSF modelling error. For example, the PSF model may 
be too small in some regions, leading to positive PSF-galaxy corre- 
lation, but too large in others, leading to negative PSF-galaxy cor- 
relation. Averaging over the field may lead to no net detection of an 
effect. However, we are not able to analyse the fields in smaller sub- 
samples, as the statistical and cosmic shear uncertainties become 
large. 

The results from the full-field analyses are generally encour- 
aging, with no obvious dependence of a cross-correlation signal on 
either signal-to-noise ratio or fitted galaxy size. The possible ex- 
ception to this statement is at the lowest fitted galaxy scalelengths, 
less than 0.7 pixels, where often a positive cross-correlation is ob- 
served. This is very likely due to reaching the limits of the sampled 
models that were generated as described in AppendixlAl It should 
not be surprising that systematic effects may be seen for galax- 
ies with major axis scalelengths smaller than one pixel in size, and 
considerably less than the PSF size. Fortunately, the number of such 
galaxies is relatively small, and they all have small weights (Fig.O. 

If there are residual errors in the relative astrometry, this would 
result in systematic shear patterns being induced with scalelengths 
of variation determined by the scalelength of the astrometric er- 
rors (probably many arcminutes). The resulting systematic signal 
would likely not be correlated with any other measurable quantity. 
We tested for this in a subset of CFHTLenS fields by switching off 
the relative astrometry correction described in Section|4l remeasur- 
ing those fields and generating shear-shear correlation functions. 
Only small differences, much smaller than the statistical uncertain- 
ties in the correlation function, were observed, indicating that for 
CFHTLenS the absolute astrometric accuracy already is adequate 
for shear measurement from individual exposures. 

An i n-depth analysis of the survey for systematic errors is pre- 
sented bv lHevmans et alj j2012h . 



8 CFHTLENS IMAGE SIMULATIONS 

The lensfit shape measurement algorithm has previously undergone 
a series of verification tests using simulated sheared images that test 
a ran ge of dif ferent observing conditions, galaxy and PSF types 
iKitching et al][2008) . These tests have used both sets of simula- 
tions from the Shear TEs ting Programme (STEP; .Hevmans et al.l 
12004 iMassev et alj|2007h . demonstrating sub-percent level accu- 
racy. The philosophy of lensfit, to analyse data from individual ex- 
posures rather than a co-added image, requires a suite of specialized 
image simulations that mimic this main feature of the CFHTLenS 
analysis. We describe in this section how these simulations com- 
pare to previous image analysis challenges and the subsequent ver- 
ification and calibration analysis. 

As well as confirming that the algorithms described above do 



work on multiple exposures, the simulations also allow us to mea- 
sure the calibration term that is required to correct for no ise bias, 
as disc usse d in Section[3^ in Appen dix[C) and bv Refregier et al.l 
( 1201 2[) and lMelchior& Viol j j2012l) . iKacprzak et al.l J2012h also 
describe the use of simulations for calibration of noise bias. 



8.1 GREAT CFHTLenS Image Simulations 

Our philosophy fo l lowed the strategy of the GREAT challenge^ 
( lBridleetalJl2O10l : iKitching et al1l2012h . to design the simplest 
simulation that nonetheless addresses the specific question: how ac- 
curate is lensfit when analysing realistic galaxy distributions, mea- 
sured with multiple, low-signal-to-noise exposures? We therefore 
chose a constant PSF across the field of view, and we postpone 
discussion of our tests on the robustness of the measurements to 
a sp atially varying PSF i n the actual data to the companion pa- 
per dHevmans et alj|2012h . In the simulations, we did, however, 
vary the PSF between the multiple exposures simulated for each 
field, based on the variation between exposures that we see in the 
CFHTLenS data. 

We simulated the full CFHTLenS area with each simulated 
one square degree field consisting of seven dithered exposures, with 
a linear dithering pattern that was matched to the average linear 
dither pattern in the survey, ignoring higher order astrometric cor- 
rections. Galaxies and stars were positioned within the field with 
the same distribution as in the real data and with matching magni- 
tude and signal-to-noise ratio. We then created a population of disk- 
plus-bulge galaxies such that the distributions of galaxy brightness, 
size, intrinsic ellipticity and bulge fraction matched the distribu- 
tions described in Section [X2l and AppendixlB] As in those models, 
the disk and bulge ellipticities were set equal. The unlensed orien- 
tation of each galaxy was random and each galaxy was assigned a 
cosmological shear based on its original position and redshift in the 
CFHTLenS da ta, as simulated by our N-b ody lensing simulation 
of CFHTLenS dHamois-Deraps et alj2012l) . In order to potentially 
allow reduction of the impact of noise from the intrinsic galaxy el- 
lipticity, we followed Massey et al. (2007) by creating two image 
simulations for each field, with the intrinsic ellipticities rotated by 
90° between the two images, although in practice we found that 
such shape noise mitigation was itself problematic and was there- 
fore not used, as discussed in Section [84l Camera distortions were 
not included in the simulations but the lensfit pipeline still applied 
a correction as it would on real data. 

The model prior distributions differed significantly from the 
distributions assumed in previous image challenges, as shown in 
Fig. [7] We simulated the galaxy population as comprising two 
types: 90 percent of galaxies were disk-dominated, the remain- 
der were bulge-dominated. No irregular galaxies were included. 
The upper panel compares the intrinsic ellipticity distributions in 
our custom CFHTLenS simulations to those simulated in the first 
STEP challenge and both GREAT challenges. It can be seen that 
the distributions of ellipticity for the disk-dominated population 
differ significantly. The bulge-dominated galaxy population ellip- 
ticity distribution was more similar to the ellipticity distribution 
assumed in previous challenges, although the fraction of bulge- 
dominated galaxies in the STEP simulations was negligible. The 
lower panel of Fig. |7] shows the size distribution for both popula- 
tions, which varied with galaxy magnitude in our custom simula- 
tions. This can be compared to the somewhat larger size distribution 
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Figure 6. Examples of cross-correlation between PSF model ellipticity and galaxy ellipticity in three CFHTLenS fields, from left to right, Wlm2pl, Wlp3mO 
and Wlp2m3. Upper panels show the cross-correlation as a function of signal-to-noise ratio fgNi lower panels as a function of fitted galaxy major-axis 
scalelength r^. Vertical bars indicate 68 percent confidence intervals. 



for the first STEP challenge, scaled according to the relative pixel 
sizes in STEP and CFHTLenS, and the set of three discrete galaxy- 
to-PSF size ratios simulated in the GREAT08 challenge, which for 
the average CFHTLenS PSF corresponds to galaxy scalelengths of 
r = [0.2, 0.3, 0.6] arcsec. Comparing these distributions indicates 
how challenging our custom simulations were compared to previ- 
ous image simulations, owing to the more realistic, broad elliptic- 
ity distributions and high fraction of small galaxies. We therefore 
would expect neither lensfA nor other methods to perform as well 
on these custom simulations as in previous challenges. 
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The image simulations were produced using a modified ver- 
sion of the GREAT challenges i mage simulation so ftware pack- 
age detailed in Appendix A.3 of Isridle et id] feoioh . Aside from 
the differences in galaxy populations described above, our simula- 
tions also differed from the GREAT08 and GREAT 10 simulations 
by allowing for overlapping object isophotes, by having each ex- 
posure containing both galaxy and stellar objects and by simulat- 
ing dithered data. We also modified the definition of galaxy size to 
be the size of the semi-major axis of a galaxy's isophotes, instead 
of the geometric mean of the semi-major and semi-minor axes. 
This redefinition is co nsistent with the co nvention used for mea- 
surements of data (e.g. lSimard et alJboOZ') and avoids introducing 
an incorrect correlation between size and ellipticity (see also Ap- 
pendixlAt. The definition of signal-to-noise ratio was chosen such 
that it was defined within a circular aperture of radius of 0.935" (5 
pixels) for all galaxies. Using a matched aperture flux, as measured 
by SExtractor on the data, then allowed us to directly replicate the 
signal-to-noise distributions within each CFHTLenS exposure. 

The choice of simulated PSF model was motivated by the data. 
To each of the 36 MegaCam CCDs in each CFHTLS exposure, we 

1 ' ' : 

fitted an elliptical-isophote Moffat ( 1969) profile to the images of 
stellar objects. A circular light profile at distance r from the centre 



3 



ou 



: — 1 1 1 1 1 1 


T 1 1 1 1 1 1 1 : 




i=24 

__i=23 - 

__.i=22 ; 

STEPl 1 




— ; 







0.5 1 1.5 



r (arcsec) 

Figure 7. Upper panel: the intrinsic ellipticity distribution for disk- 
dominated galaxies in the STEP (dotted curve), GREAT (long-dashed) and 
CFHTLenS (short dashed) simulations and for bulge-dominated galaxies 
in the CFHTLenS simulations (solid curve). Lower panel: the magnitude- 
dependent galaxy size distribution, where r is the major-axis half-light ra- 
dius for the bulge component and the major-axis exponential scalelength 
for the disk component. The CFTHLenS simulated distributions are shown 
at magnitudes i' = 22 (short dashed curve), i' = 23 (long dashed) and 
i' = 24 (solid). The distribution of the first STEP challenge is also shown 
(dotted curve). 
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Figure 8. The distribution of PSF parameters measured from tlie 
CFHTLenS data and replicated in tlie CFHTLenS simulations. The upper 
panels show the PSF ellipticity. The lower panels shows the PSF FWHM in 
arcsec (left) and the shape parameter /^Moffat (right). 



of a star was given by 

/(r) = /o 



1+ ( - 



-/^Moffat 



(9) 



where Jo is the peak flux, and the scale radius rs and the shape 
parameter /^Moffat were related to the FWHM of the PSF as 



FWHM = 2rs V2i/''Moffat _ 1. 



(10) 



We created a series of circular model PSFs on a pixel grid for a 
range of scale radii rs and shape parameters /3Moffat and applied 
a linear transformation of the co-ordinates to shear the model PSF 
to have elliptical isophotes. We then calculated the best-fit PSF el- 
lipticity and the Moffat profile parameters /JMoftat and a for each 
CCD chip. For each simulated exposure we adopted a PSF model 
for the full field of view by selecting a CCD PSF at random. This 
was to ensure that we fully represented the range of PSF models at 
all locations in the images, rather than taking the often more cir- 
cular average over the full field of view. We then grouped the PSF 
model exposures into fields following the grouping of exposures in 
the real data. The distribution of PSF parameters measured from 
the CFHTLenS data, and replicated in the simulations, is shown in 
Fig.i 



8.2 SkyMaker CFHTLenS image simulations 

When constructing calibration corrections, it is of crucial impor- 
tance to verify the base image simulations used to ensure the cal- 
ibration is not dependent on the simulation software used. To this 
end we created a second suite of 140 Mega Cam image simulations 
generated using SkyMakefl jSertinl 12009) . The SkyMaker PSFs 
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were not Moffat profiles, but instead had radial surface brightness 
profiles that took into account the telescope optics, atmospheric 
seeing and scattering into wings of the PSF, although those adopted 
here had circular symmetry. Unlike the GREAT simulations, only a 
single exposure was simulated, for reasons of computational time, 
and the signal-to-noise distribution of galaxies on those single ex- 
posures was matched to that of the CFHTLenS galaxies on the 
coadded data. The signal-to-noise ratios were generally smaller 
than in the SkyMaker simulations tested in the first STEP challenge 
jHevmans et alj[2006h . Furthermore, the size and ellipticity distri- 
butions were chosen to match those in the custom CFHTLenS sim- 
ul ations described above , unlike the SkyMaker simulations tested 
in iHevmans et id] i2006h . which follow the dotted curves in Fig. [7] 
The galaxy and star positions were chosen at random and circular 
PSFs with a IWHM of 0.7" were used. A cosmic shear signal from 
a redshift plane at z — 1 was assumed in the simulation. 

8.3 Model bias 

The primary aim of the simulations was to measure the bias caused 
by noise (Appendix C), and for this aim the simulated galaxies 
should match the real population as closely as possible. The strat- 
egy of matching the simulations to the models and priors used in the 
lensfit measurements is reasonable, as these distributions have been 
derived from observations and are therefore a good representation 
of the galaxy population. The tests presented here do not test for the 
effect of deviations of the true population from the model assump- 
tions, but they do test the efficacy of the method taking into account 
the implementation of the statistical algorithm in Section l331 in- 
cluding the effects of convolution with appropriate PSFs, the effects 
of finite signal-to-noise and noise bias, the combination of multiple 
dithered images and the effect of masks applied to the data. 

The difficulty of attempting also to test for the effect of devia- 
tions of the true population from the assumed population is that we 
have little prospect of obtaining an independent assessment of how 
different the real Universe might be from our assumptions, even at 
the level of knowing the distributions o f surface brightness of in - 
dividual galaxies in the local universe jGraham & Worlevll2008h . 
Notwithstanding that difficulty, this qu estion has been considered 
by other authors. I Voigt & Bridld ( I2OI0I) considered the multiplica- 
tive biases that might arise from fitting multiple gaussian compo- 
nents to exponential or de Vaucouleurs surface brightness distribu- 
tions, convolved with a PSF. It was found that fitting a single gaus- 
sian to an exponential distribution caused a multiplicative bias of 
3 percent and an additive bias of ~ 0.001, but that these biases fell 
to 0.3 percent and < 10~* when two gaussian components were 
adopted. Larger biases were found when fitting gaussians to a pure 
de Vaucouleurs profile (3 — 6 percent and 0.001 respectively, when 
fitting two gaussians), but this is not surprising given the signifi- 
cant difference in surface brightness profile between gaussian and 
de Vaucouleurs distributions. We note that pure de Vaucouleurs- 
profile galaxies are expected to be a small fraction of the true pop- 
ulation. Multiplicative biases of 2 — 5 percent were found when 
fitting two gaussians to composite galaxy profiles comprising ex- 
ponential plus de Vaucouleurs components: in this case we expect 
our fitting of exponential plus de Vaucouleurs models to work well 
and therefore to lead to sig nificantly smaller biases. 

IVoigt & Bridld ( l2010h argue that these biases arise because a 
gaussian profile is too flat to reproduce well the central regions 
of realistic galaxy profiles. In this paper, we fit two components, 
exponential plus de Vaucouleurs, to the real population of galax- 
ies, and these should be far better representations of the true sur- 
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face brightness distributions than multiple gaussians, with suffi- 
cient flexibility to model well a wide range of Sersi'c indices. We 
the refore expect model b ias to be significantly smaller than found 
bv lvoigt&Bridi3 Jioioh when fitting gaussians, and to be sub- 
dominant compared with noise bias, as evaluated in the following 
sections, in the CFHTLenS analysis. 

Future surveys covering larger sky areas will have more strin- 
gent requirements on the acceptable level of systematics than 
CFHTLenS, and it is likely that the question of model bias will 
need to be revisited for those surveys. 

8.4 Analysis and Calibration 

We processed both sets of image simulations with lensfA using the 
known galaxy and star positions as an input catalogue. Fig.[9]shows 
the measured galaxy size, signal-to-noise ratio, bulge fraction and 
ellipticity distributions from both sets of image simulations and 
the CFHTLenS data. In all cases the distributions incorporate the 
weights assigned by lensfA (equation[8}. The simulated signal-to- 
noise distribution shown in the upper left panel was constructed 
to match the CFHTLenS depth. We found reasonable agreement 
between the ellipticity distribution measured in the data and simu- 
lations, although in the weighted distributions of CFHTLenS there 
is an excess of galaxies at jej ~ 0.45. We do not expect perfect 
agreement between the measured distributions and priors, because 
only the summed posterior distribution is expected to match the 
prior distribution (Paper I), whereas Fig.|9] shows the values ob- 
tained from the expectation values of the probability distribution 
(after marginalising over the other parameters). The excess is only 
present in the weighted distribution, and not in the unweighted dis- 
tribution, which implies that, on average, galaxies with non-zero, 
middling ellipticities have more sharply defined likelihood surfaces 
than their peers at lower or higher ellipticity. This effect is not fully 
understood, but it is probably linked to the issue of noise bias that 
is discussed in Appendix|C] In comparison to the data, in the sim- 
ulations there are also lower fractions of small galaxies for which 
an ellipticity was determined. As these small galaxies do exist in 
the simulation (see Fig.|7]l this might indicate a mismatch between 
the PSF in the data and simulations, such that small galaxies were 
harder to measure in the simulations than in the real data. Either 
pixelisation and noise from the data led us to overestimate the PSF 
size which was then used in the simulation, or the Moffat profile 
used in the GREAT simulations and the SkyMaker PSF were poor 
models of the MegaCam PSF. There are, however, sufficient num- 
bers of small galaxies to calculate a calibration as described below, 
and as these small galaxies were assigned the lowest weights by 
application of equation[8l it should not be problematic that their 
calibration was based on relatively fewer galaxies than for other 

sizes in the data^ 

We follow iHevmans et al. in modeling our shape mea- 

surement as a linear combination of a multiplicative error m and an 
additive error c such that 

6°*^" = (1 + m)e"^"'= + c + Ae (11) 

where e°^^ is the observed complex ellipticity as measured by 
lensfA, Ae is the noise on the measure and 6*"^"° is the sheared 
intrinsic ellipticity defined as in equation[T] and for simplicity we 
treat m as an orientation-independent scalar. At this point we are 
faced with several alternatives to determine m and c. Comparing 
6°'"' directly with 6*"^"° would lead to a strong noise bias in the 
measurement of m and c, as the observed ellipticity is bounded 
with je°'"^| < 1 such that (Ae) is non-zero for a galaxy with 
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Figure 9. Comparison of the weighted distributions of measured signal-to- 
noise ratio (upper left), galaxy size (upper right), ellipticity distributions 
(lower left) and bulge fraction (lower right) from the custom GREAT image 
simulations (crosses), custom SkyMaker image simulations (circles) and the 
CFHTLenS data (solid). 

non-zero ellipticity. Comparing e°'^'' directly with shear 7 would 
also lead to an intrinsic ellipticity noise bias, as for a finite number 
of galaxies (6'"^"°) may be non-zero. This is particularly relevant 
when galaxies are binned in order to look for size- and signal-to- 
noise-dependent calibrations, increasing the random shot noise in 
each bin. Massev et al. (2007) developed a method to circumvent 
this issue by including rotated pairs of galaxies in the simulations 
such that (e'"') = by construction, where e'"' is the intrinsic 
ellipticity prior to application of any lensing shear. We generated 
such galaxy pairs in the custom GREAT simulations, but found 
that, when galaxies within a pair were assigned different weights, 
which at the low signal-to-noise regime was typical, (lue'"') was 
significantly non-zero. Thus we did not use this method of shape- 
noise cancellation. Instead, our preferred analysis method was to 
first bin galaxies by size r and signal-to-noise ratio i^sN, such that 
there were equal numbers of objects in each bin, and then to sub- 
divide those bins by the input cosmic shear of each galaxy. Each 
galaxy was entered into two bins of shear, treating each component 
as being an independent measure of shear for the purposes of de- 
riving the calibration correction. Then, a calibration was derived by 
minimizing given by 

= X] ~2 {^'i'^ - [1 + m(!/SN, r)] ^f^" - C(!/SN, r)^ , 

i 

(12) 

where for each bin i 

-obs _ 1] j "'i i'^aj ~ ^aj ) 

and where for galaxy j in bin i, j is either the ei or 62 component 
of the ellipticity, with a = 1,2 being set by the component of the 
input cosmic shear for that galaxy in that bin. 

The error on ^f°^ was estimated through a bootstrap er- 
ror analysis within each bin i. was found to be uncorrelated 
between bins. The summation over both ellipticity components in 
equation[T3]was motivated by an initial component-dependent min- 
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size, given by 





Figure 10. The weighted average true intrinsic ellipticity (toe'"*) as a func- 
tion of the PSF ellipticity e* for four sets of GREAT-simulated galaxies 
binned by galaxy size and signal-to-noise ratio. Small galaxies with major- 
axis scalelengths r < 0.3" are shown in the upper panels; large galaxies 
with r > 0.3" are shown in the lower panels. The left-hand panels show 
the faint sample with i/gN < 20; the brighter sample with i/gN > 20 are 
shown on the right-hand panels. Each bin contains roughly the same number 
of galaxies. 



imization, which showed negligible differences in the calibration 
measured for each ellipticity component. 

This analysis method ensured that we were unbiased to mea- 
surement or intrinsic ellipticity noise, but it does disguise biases in- 
troduced by the weights. In Fig.[TO]we use the GREAT CFHTLenS 
simulations to show the weighted average true intrinsic ellipticity 
(we'"') as a function of the PSF ellipticity e* for four sets of galax- 
ies binned by galaxy size and signal-to-noise ratio. For each galaxy 
sample we find a preferential weighting of galaxies oriented in the 
same direction as the PSF. This weight bias is independent of which 
ellipticity component we choose and affects bright large galaxies 
to a similar extent to small, faint galaxies. It is an undesirable out- 
come of the weighting scheme, which correctly identifies that it is 
easier to measure the shapes of galaxies oriented with respect to 
the PSF in contrast to galaxies oriented perpendicular to the PSF, 
and may be viewed as a form of the orientation bias discussed by 
iBemstein & Jarvii ( |2002| . Section [X6l l. We investigated isotropisa- 
tion schemes to remove the PSF-related weight bias, by assigning 
galaxies an averaged weight based on its size relative to the PSF 
size and the signal-to-noise ratio of the galaxy. We found how- 
ever that the isotropisation reduced the effectiveness of the weights, 
which also decrease the noise bias in the shear measurement (Ap- 
pendixlct. and resulted in an even stronger weight bias in the op- 
posite sense. We therefore concluded that weight bias, which im- 
pacts shape measurement at the 10^'^ level, is an undesirable fea- 
ture of applying inverse-varian ce weighting, but is a small effect 
for the study of CFHTLenS. In lHevmans et alj j20I2i) we describe 
a method to identify and remove fields which show any significant 
correlation with the PSF, which can arise both from this weight bias 
and from the general effect of noise bias. 

Fig-El shows some example results of the measured multi- 
plicative error m(i'sN, r) in the GREAT simulations as a function 
of i/'sN for four size bins selected to be the smallest, and the three 
discrete galaxy sizes simulated in GREAT08. The solid curve is a 
fit to the full data set, which comprises 20 bins in each of r^sN and 



■ exp (—or !/sN ] 



(14) 



with a = 0.057 and /? = —0.37. We find that for all galaxies with 
r > 0.5", or vsn > 35, m is consistent with zero on average, 
but that the calibration correction increases for smaller and lower 
signal-to-noise galaxies, as expected for noise bias (AppendixICt. 
The lensfit weighting scheme already downweights these noisy 
galaxies, such that even though the calibration correction can be 
significant, these galaxies contain very little weight in our scien- 
tific analyses. This can be seen in the left panel of Fig.[T2l which 
shows the multiplicative calibration for each bin of signal-to-noise 
ratio and size as a function of the average weight of the galaxies 
in the bin, for both the GREAT and SkyMaker simulations. It can 
be seen that the net effect is very similar in the two sets of simula- 
tions, although the SkyMaker simulations display a slightly larger 
bias than the GREAT simulations for galaxies with low signal-to- 
noise ratio and thus lower lensfil weight. As the GREAT simu- 
lations have anisotropic PSFs that closely match the CFHTLenS 
PSFs, and as they incorporate the effect of having seven multiple 
exposures, whereas the SkyMaker simulations are single exposures 
in each field, we use the results from the GREAT simulations to 
calibrate the CFHTLenS shear measurements. The differences be- 
tween the results from the simulations only become apparent for 
galaxies which are significantly downweighted in CFHTLenS, and 
overall it is encouraging to note the close match between the two 
sets of simulations. The right panel of Fig.[T2]shows how the mean 
applied calibration correction (rn) varies with the photometric red- 
shift of the galaxies in CFHTLenS, showing primarily the larger 
correction that needs to be applied at redshifts Zp > 0.8 owing to 
the decreasing signal-to-noise ratios and sizes of galaxies at higher 
redshifts. We note also the larger correction deduced for galaxies 
at low redshifts, Zp < 0.3, which probably arises from contamina- 
tion of the low-reds hift population by gala xies with higher spectro- 
scopic redshifts (see lHevmans et alj20I2l for further discussion). 

We find no evidence for a significant additive error c term 
in the lensfil analysis of the simulations, indicating that the mul- 
tiplicative error arising from noise bias is the dominant term to be 
corrected. 



8.5 Simulated Cosmic Shear Tomography 

To test the application of such a calibration, we estimated the 
calibration-corrected two-point shear correlation function (other 
papers will describe the correction to estimators such as the three- 
point function and others). We first calculated the uncalibrated two- 
point function. 



[et{xi)et{xj) ± er{xi)er{xj)] 



(15) 



where the weighted sum is taken over galaxy pairs with 
\xi — Xj\ —9. The two-point correction term was then estimated 
as 



i + K{e) = 



(16) 

for the same galaxy pairs, thus taking into account potential astro- 
physical correlations between galaxy size, brightness and cluster- 
ing scales. The calibration corrected two-point function was then 
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Figure 11. Example results of the measured multiplicative error m(i^st.j, r) 
in the GREAT simulations as a function of signal-to-noise ratio for four size 
bins selected to be the smallest scale length (0.1", upper left), and the three 
discrete galaxy sizes equivalent to those simulated in GREAT08 (0.2, 0.3, 
0.6 arcsec). The solid curve is the best fitting functional form to the full 
data set which comprises 20 bins in both signal-to-noise ratio and size. The 
lower two panels show m(t'sj>i, r) as a function of r for two bins of i^s^. 
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Figure 12. (left) The multiplicative calibration m for each signal-to-noise 
ratio and size bin as a function of the average lensfii weight (to) of the sim- 
ulated galaxies in the bin. Results are shown from the GREAT (crosses) and 
the SkyMaker (open circles) simulations, (right) The mean applied multi- 
plicative calibration (m) for CFHTLenS galaxies binned by photometric 
redshift Zp . Vertical bars indicate the rms of the distribution of m values 
within each bin. 
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Figure 13. Comparison of the measured to input two-point correlation func- 
tion for the raw uncalibrated data (circles) and the calibrated data (triangles) 
from the GREAT CFHTLenS simulations. The error bars are representative 
of the signal-to-noise ratio expected for a CFHTLenS-like survey. To inves- 
tigate the impact of the calibration as a function of redshift we show the 
signal for three tomographic configurations; 0.1 < 2 < 1.3 (upper panel), 
0.5 < 2 < 0.85 (middle panel) and 0.85 < 2 < 1.3 (lower panel). 



given by 



1 + K{e) 



(17) 



The alternative of applying the calibration correction directly to in- 
dividual galaxies could result in instability in the rare events where 
1 + m(z/sN,i, — >■ 0. In addition, using this method automat- 
ically removes any unwanted correlation between the calibration 
correction and intrinsic ellipticity which could arise from the mea- 
sured galaxy signal-to-noise ratio or size being weakly correlated 
with shape, which is a possible outcome of any shape measure- 
ment method. We show a comparison of the measured and input 
two-point correlation functions in Fig.[T3] for the raw uncalibrated 
data and the calibrated data. The error bars are representative of the 
signal-to-noise ratio expected for a CFHTLenS-like survey and the 
data points are strongly correlated. To investigate the impact of the 
calibration as a function of redshift we show the signal for three 
tomographic configurations; a full sample with .1 < z < 1.3 
similar to the sample used in lKilbingeretai] ( l2012h and two tomo- 
graphic bins with 0.5 < z < 0.85 and 0.85 < z < 1.3, similar to 
the two tomographic bins used in Benjamin et al. (in preparation). 
Note that a very low redshift bin with 0.1 < z < 0.5 has statistical 
error bars that encompass the full figure. Fig.[T3] shows that for a 
CFHTLenS-like survey the correction is well within the statistical 
errors. We do see that the calibration correction is more significant 
for the higher redshift sample, as expected given that the average 
size and signal-to-noise ratio decreases with redshift. 

We determined the systematic error contribution to the 
calibration-corrected two-point function £,'±^{0) as follows. First, 
we took the likelihood distribution of the multiplicative calibra- 
tion correction parameters in equation [141 constrained by a fit 
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Figure 14. Confidence regions of the multipficative calibration coirection 
parameters in equation [T4l used to infer systematic error contribution to the 
calibration corrected two-point function 

of the calibration correction to the simulated data. This likelihood 
distribution is shown in Fig.[T4]with the 68%, 90% and 95% con- 
fidence intervals. We randomly sampled this distribution such that 
we had a sample of A'^ = 100 {a, j3) values that fully represented 
the probability of the fit (i.e 68% of the pairs lay within the la 
contour). We then calculated the two-point correction term Kn (9) 
for each of the A'^ resampled values and determined the variance of 
the calibration-corrected two-point function (9) between the TV 
differing corrections. This method allows us to isolate a systematic 
error covariance matrix for the two-point shear correlation func- 
tion, allowing marginalisation over th e uncertainty on the calibra- 
tion correction. lKilbingeretarH2012h show that this has a negligi- 
ble effect on cosmological parameter estimates from the two-point 
shear correlation function. The concept can be easily modified to 
cover any number of the weak lensing statistics that will be used to 
analyse CFHTLenS. 



9 CONCLUSIONS AND FURTHER DEVELOPMENTS 

We have presented the method of shear measurement d eveloped for 
CFHT Len S, based on the earlier lensfit algorithm of iMiller et al.l 
( I2OO7I) and lKitching et al.l jlOOSh . The method fits PSF-convolved 
galaxy models to multiple galaxy images to estimate galaxy lensing 
shear. Significant enhancements have been made to the basic algo- 
rithm in order to create a method that can be applied to the survey, 
as follows. 

(i) Creation of pixel models of the PSF on individual exposures, 
allowing for spatial and temporal variation of the PSF and for dis- 
continuities between detectors in the imaging camera. 

(ii) Use of two-component models, disk plus bulge, with a prior 
distribution for bulge fraction estimated from published work. 

(iii) Measurement of likelihood from multiple dithered expo- 
sures, optimally combining information when the PSF differs be- 
tween exposures, and automatically allowing any one galaxy to 
have differing numbers of useful exposures, where that number 
may vary because of the effect of detector and field boundaries 
coupled with the dither pattern. The multiple exposures are nei- 
ther interpolated nor co-added, avoiding systematic effects that are 
otherwise introduced by those processes. 

(iv) Measurement of camera distortion from astrometric mea- 
surements, and correction of fitted models to optimally remove the 
effects of distortion from the shear measurement, rather than cor- 
rection of the data using interpolation (i.e. interpolation of data is 
avoided). 



(v) Use of realistic priors on galaxy size, ellipticity and bulge 
fraction, based on published HST and other data. 

(vi) Improved algorithms for bayesian marginalisation over 
galaxy position, size, flux and bulge fraction. 

(vii) Modification of the shear estimator from a posterior proba- 
bility estimator to a likelihood estimator, to avoid anisotropy and 
bias caus ed by implementat ion of the sensitivity correction de- 
scribed bv lMiller et aljJioOT ^. but at the cost of needing to calibrate 
the effect of noise bias on the estimator. 



We regard the issue of making measurements from non- 
interpolated individual exposures as being particularly important. If 
images are interpolated and co-added, the resulting non- stationary 
convolution kernel that has been introduced can have a severely 
detrimental effect on lensing shear measurement. In principle, the 
varying convolving kernel could be evaluated at each galaxy lo- 
cation and a correction applied to the PSF, but working from indi- 
vidual exposures avoids that complexity. Furthermore, the resulting 
co-added PSF is likely to have even more complexity than the PSF 
on individual exposures, and in general would be expected to lead 
to co-added PSFs with 'twisted' isophotes (ellipticity and centroid 
varying as a function of distance from the peak of the PSF), which 
would cause problems for many published methods of shear mea- 
surement. As multiple exposures are usually dithered, and as the 
PSF may be discontinuous between detectors (as in MegaCam) and 
as there are gaps between detectors, any co-added PSF inevitably 
has discontinuous variations across the blurred regions between de- 
tectors in the dithered images. The 'gap' regions are not generally 
sufficiently large to allow independent measurement of the PSF 
from stars in those regions, so either such regions would need to 
be excised, or the PSF would need to be measured on individual 
exposures and the local PSF reconstructed appropriately. Excision 
of such problematic regions would lead to significant reduction in 
area for surveys with a large dither pattern. These problems are all 
avoided with the joint-fitting approach described here. 

We have tested the method using two, independent, extensive 
simulations of CFHTLenS based on the GREAT and SkyMaker 
image simulation codes, but with galaxy parameters more closely 
matched to those found in the faint galaxy populations than have 
been assumed in previous lensing challenges. The faint galaxy pop- 
ulation has galaxies of smaller size and a broader range of elliptic- 
ity than has been assumed in the STEP and GREAT challenges, 
which leads to a degradation in performance compared with what 
would have been naively expected from those simulations. When 
creating image simulations to test shear measurement methods, it 
is important to include real observational problems such as those 
described in this paper, and also to ensure that the simulated galaxy 
population mimics closely the observed distribution. With the cur- 
rent generation of surveys, we have only been able to match the 
coarse features of the population, namely size and ellipticity, and 
even there uncertainties remain over issues such as the redshift 
distribution of galaxy properties. Future, large-area surveys which 
aim to achieve significantly lower systematic uncertainties than in 
CFHTLenS should be concerned about our current lack of knowl- 
edge in this area. 

Finally, the issue of noise bias has been discussed by a num- 
ber of recent authors, and with current estimators we do need to 
calibrate the noise bias from realistic simulations. One hope for the 
future is that a properly-formulated bayesian estimator may be able 
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to rigorously correct for noise bias without the need for calibration 
simulations. 
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APPENDIX A: GALAXY MODELS 

Galaxy models were created for CFHTLenS according to two basic 
types, bulge- or disk-dominated, as described in Section [T2l Thus 
two surface brightness profiles were assumed, for each of the bulge 
and disk components of galaxies. For computational speed, these 
were two-dimensional distributions with radial surface brightness 
dependencies having Sersic index equal to either 1 or 4, which 
were given elliptical isophotes by a 2D shear transformation. The 
transformation was defined so that the semi-major axis of a model 
component was invaria nt (unlike, for example, the GREAT08 o r 
GREATIO simulations, iBridle et alj l20ld (Kitching et alj lioTj) . 
The rationale for this choice was that the ellipticity of disk galax- 
ies is determined primarily by the inclination angle of the disk to 
the line of sight, in which case the semi-major axis is the best es- 
timate of an inclination-independent intrinsic size of the galaxy. 
When defining a prior based on the distribution of intrinsic size of 
galaxies, that prior should be applied to the distribution of semi- 
major axes. 

Exponential and de Vaucouleurs surface brightness distribu- 
tions are sharply peaked towards their centres, so if the sur- 
face brightness distributions are generated on a coarse pixel 
grid, the central pixel may become incorrectly dominant. The 
problem is worse for the de Vaucouleurs profile, p(r) oc 
exp(— 7.67(r/rh)^''''): the central pixel is a factor 2143 brighter 
than pixels at r = rh. A partial solution to the problem is to gener- 
ate surface brightness distributions on a finer-resolution grid, how- 
ever for moderate oversampling factors (a factor 1 1 oversampling 
was adopted for CFHTLenS), even that is not good enough. For 
CFHTLenS this problem was overcome by applying an integral 
constraint, requiring the sum of the model pixel values to equal 
the expected integral under the de Vaucouleurs profile, adjusting 
the central pixel value so that this condition was met. An alterna- 
tive approach would be to use the 'photon-shooting' monte-carlo 
method employed in the GREAT simulations. 

Model components were convolved with the local PSF pixel- 
basis model at the location of each galaxy being fitted. Because 
both the galaxy model and the PSF contain Fourier modes above 
the Nyquist sampling limit of the observed image, we should seek 
to generate models at high resolution, convolve with a similarly 
high-resolution PSF, and only after that convolution, downsample 
to the observed sampling (noting that the downsampling should de- 
pend on the galaxy's precise location). However, as the PSF had to 
be estimated from the stars on each exposure, it was not straightfor- 
ward to generate reliable models with high-frequency modes cor- 
rectly represented. In principle, such high-frequency information 
may be recoverable from the data because stars are observed at a 
variety of positions with respect to the pixel grid, however the PSF 
also varies between stars, and each star is itself a noisy measure 
of the downsampled PSF, so it was decided not to attempt to re- 
construct oversampled PSF pixel-basis models. In effect, the PSF 
models adopted had their high-frequency modes aliased into the 
range of observed frequencies. In an attempt to make an approxi- 
mate correction for that effect, an approximate PSF model was cre- 
ated, assuming circular isophotes, which comprised a Moffat pro- 
file fitted to the PSF near the centre of the field of view. The high- 
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resolution galaxy models were first convolved by the Moffat model, 
before being downsampled. This downsampled galaxy model was 
then convolved with the actual PSF pixel-basis model for each 
galaxy, but with a multiplicative Fourier-space correction applied 
to remove the effect of the Moffat model. In the limit of band- 
limited data, an exact convolution by the true PSF model would 
have been achieved by this process. For the actual data, which may 
contain higher-frequency modes, some level of correction for those 
modes was obtained by this approximate method, but clearly a fu- 
ture improvement would be to reconstruct a high-resolution PSF 
pixel-basis model from the data and deal with the high-frequency 
modes more rigorously. 

It is worth noting that, because the observed stars have been 
convolved with the detector's pixel response function (usually as- 
sumed to be a top-hat function), the PSF model employed already 
included the effect of the mean pixel response. No further averag- 
ing across pixels was done. 

A further problem with generating surface brightness dis- 
tributions from sheared 2D distributions is that disk galaxies 
change their surface brightness profiles with inclination angle: 
edge-on galaxies have a surface brightness distribution along 
their minor axes which are best described as a sech^ function 
jvan der Kruit & Searlel[T98lh . This is thought to be partly a re- 
sult of the distribution of stars perpendicular to the disk and partly 
a resu lt of obscuration associated with the disk dOraham & WorlevI 
120081) . In order to attempt to address this, in one set of tests, the 
2D disk models were replaced by a 3D model in which the model 
surface brightness had a sech'^ distribution perpendicular to the 
plane. Obscuration was assumed to be distributed the same as the 
stars (probably an oversimplification of the true obscuration), and 
the projected 2D distribution was obtained by integrating through 
the 3D di stribution. The obscuratio n was set to match the fits re- 
ported by iGraham & WorlevI J200^ . However, when the tests for 
systematics, described in Sectionl?] were made, there appeared to 
be no significant improvement in the results. Inspection of typical 
PSF-convolved 3D models showed that they were indistinguish- 
able from 2D models (once allowance fo r the apparent increase in 
observed scalelength had been made, see iGraham & Worlevll2008l) 
and it was concluded that the extra computational expense of the 
3D models was not justified. The likely reason for the lack of dis- 
tinction between 2D and 3D models is that, in the CFHTLenS data, 
most galaxies are only marginally resolved, and the semi-minor 
axis in particular is much smaller than the size of the PSF. This 
test did, however, provide some reassurance that the results would 
not be sensitive to this aspect of the galaxy models. 

The 2D models of disk-dominated galaxies that were used 
here had the ellipticity and orientation of the bulge component set 
equal to that of the disk. The 3D models that were also tested had 
realistic ellipticity gradients, caused by the bulge component being 
less flattened than the disk. The lack of sensitivity to the 2D or 3D 
galaxy structure arises because in the CFHTLenS data most galax- 
ies are only marginally resolved, and hence the bulge ellipticity has 
little influence on the overall measurement, which is dominated by 
the more extended disk component. In future surveys with better 
resolution, such as th e Eucliqj missio n, ellipticity gradients must 
be taken account of l lBemsteirill2oTol) and more complex galaxy 
models will likely be needed. 



APPENDIX B: PRIOR DISTRIBUTIONS 
Bl Galaxy scalelength 

The disk and bulge scalelength priors are derived from the fits 
to HST WFPC2 o bservations of galaxies in the Groth strip by 
ISimardetal1 ( l2002l) . shown in Fig.[T]for the disk-dominated galax- 
ies. The relation between the log of the median major-axis scale- 
length, Td and i-band magnitude appears linear. A least-squares fit 
to the median values in the range 18.5 < 1814 < 25.5 yields 

log^ [rd/arcsec] = -1.145 - 0.269(^814 - 23). (Bl) 

In applying this relation to CFHTLenS we assumed i^u = icfht. 
Those measurements do not strongly constrain the functional form 
of the scalelength prior: we adopted a prior of the form 

p{r) oc r exp(— (r/a)") 

where the value of a was adjusted to match the magnitude- 
dependent median of the lSimard et akl fits. The value of a was also 
not strongly constrained by those data, so we adopted a value in 
the middle of a range that appears to give a good representation of 
the distribution, a = 4/3. With this choice of a, a = r(i/0.833. 
One concern about the use of the WFPC2 data might be that large, 
low-surface brightness galaxies may have been missed owing to the 
limited surface brightness sensitivity of the observations. While we 
cannot exclude the possibility that some galaxies may be missed, 
we note that the WFPC2 data extend to significantly fainter galaxy 
magnitudes than CFHTLenS (Fig.[T) which mitigates against the 
possible loss of low surface brightness galaxies at the CFHTLenS 
depth. The use of the median galaxy size also guards against the 
loss of low surface brightness galaxies. HST observations are cur- 
rently the only way of reliably probing the size distribution of 
galaxies on scales ~ 0.1". 

The distribution of bulge scalelengths (or equivalent half-light 
radii) is more problematic. In principle, we need to consider the 
disk- and bulge-dominated populations separately. The sizes of 
bulges assumed in the models for the disk-dominated galaxies have 
been discussed in Section [T2l The distribution of sizes of bulge- 
dominated galaxies might, in principle, be obtainable from the 
ISimard et all ( l2002h analysis, however these galaxies are a minor- 
ity of the population, and given the degeneracies that exist between 
scalelength, ellipticity and surface brightness profile, in noisy data, 
it is difficult to obtain a robust estimate of the size distribution with 
present data. In the local universe, half-light radii of the most lu- 
minous early-type galaxies are about 50 percent larger than those 
of late-type galaxies, but at the galaxy luminosities appropriate 
for CFHTLenS, M ~ —19, they are about a factor 2 smaller 
jShen et al .l2003h . The distribution of early-type scalelengths is not 
well-known, but we note that the details of the early-type prior are 
not too important as they only account for about 10 percent of the 
galaxies. We therefore assign the same distributions to the half- 
light radius of the bulge-dominated population as to the exponen- 
tial scalelength of the disk-dominated population, so that the half- 
light radii of early-type galaxies ar e 0.6 times t he hal f-light radii of 
late-type galaxies, consistent with lShen et alj ( l2003h . Future work 
should seek to refine our knowledge of these distributions, particu- 
larly beyond the local universe. 



B2 Galaxy ellipticity 
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Fits to faint galaxies do not strongly constrain the distributions of 
galaxy ellipticity, so we make the assumption that faint galaxies 
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have the same distribution of disk ellipticity as low-redshift galax- 
ies of the same luminosity in SDSS: disk ellipticities are deter- 
mined largely by the disk's inclination, thickness and optical depth 
at the waveband of observation, the latter being luminosity depen- 
dent jUnterbom & Rvderill2008[) . Accordingly we fit the elliptic- 
ity distribution of 66762 SDSS disk-dominated galaxies selected 

I ~ " ' 

from DR7 ( lAbazaiian et a l. 2009) to have g-band absolute magni- 
tude Mg —19, to correspond in luminosity and waveband to 
galaxies selected at i' ^ 24.5 in CF HTLenS, for which the m edian 
redshift is estimated as z ~ 0.66 l lHildebrandtetal.ll2012h . The 
functional form adopted was 



Ae 1 



p(e) 



exp 



'-]) 



(l + e)(e2 + e2)i/2 



e < en 



(B2) 



where e = |e| and where the maximum ellipticity cutoff fimax ~ 
0.804, obtained from fitting to the distribution, arises primarily be- 
cause of the finite thickness of galaxy disks. The circularity param- 
eter eo = 0.0256, also obtained from fitting, reflects the intrin- 
sic non-circular symmetry of face-on disk galaxies: even face-on 
galaxies only rarely appear to have circular isophotes. The fitted 
value for the dispersion was a = 0.2539. The normalisation pa- 
rameter A was determined numerically. 

In principle, the ellipticity distribution should be a function of 
galaxy luminosity. However, the prior distribution only has any sig- 
nificant effect for the lowest signal-to-noise galaxies: high signal- 
to-noise galaxies have well-defined likelihood values and the pos- 
terior probability distribution is largely unaffected by the prior. Ac- 
cordingly we chose a galaxy luminosity appropriate for the faintest 
galaxies in CFHTLenS selected at the median redshift, and fixed 
that prior distribution to be the same for all galaxies. 

A prior for the ellipticity distribution of bulge-dominated 
galaxies is not readily obtained. In the absence of better infor- 
mation, we adopted a fit to the ellipticit y distribution of bulge- 
dominated galaxies in lSimard et al. U2003) " of the form 



p(e) oc eexp [—fee — ce^ 

where the best-fit values were b = 2.368, c ; 
normalisation was determined numerically. 



(B3) 

6.691 and where the 



B3 Galaxy bulge fraction 

Likewise, the bulge fraction is not well-constrained by current 
data, and even in the nearby universe the bulge fractions deduced 
for b right gala xies are de pendent on the models used to fit them 
JCraham & WorievI |2008|) . A reasonable distribution to assume, 
an d which is also cons istent with fits to faint galaxy samples such 
as [Schade et alj h996h , is that there are two populations. Bulge- 
dominated galaxies were fit by purely de Vaucouleurs profiles, 
and account for around 10 percent of the galaxy population. Disk- 
dominated galaxies were assumed to make up the remainder, and 
we assumed the bulge fraction / = B /T to have a truncated nor- 
mal distribution in / with its maximum at / = and with o — 0.1, 
truncated so that ^ / < 1. The bulge fraction was marginalised 
over as described in the main text. The bulge fraction prior was 
independent of other parameters. 



B4 Prior iteration 

In Paper II we proposed an iterative scheme to improve the accu- 
racy of the prior probability distributions using the observed sample 
of galaxies. In the STEP simulations the procedure worked well. 




Figure Bl. The ob.served distributions of axis ratio (upper) and ellipticity 
(lower) for disk galaxies with Mg < —19 in SDSS, compai'ed with the 
model function adopted for the CFHTLenS ellipticity prior (solid curves). 



However, when applied to the CFHTLenS data, the iterations did 
not readily converge. It appeared that the main cause of this prob- 
lem was that in CFHTLenS, unlike STEP, the galaxies are only 
marginally resolved, so that there are strong degeneracies between 
size, ellipticity and surface brightness profile. Given that the data 
are also noisy, it is not surprising that the iterations tended to con- 
verge to arbitrary solutions. 

Fortunately for CFHTLenS, the available HST and SDSS data 
were sufficient to achieve adequate estimates of prior distributions. 
In future surveys such as Euclid, it may be necessary to use the sur- 
vey itself to define the prior distributions. However, in that survey, 
galaxies are expected to be rather well resolved, and it is likely that 
iterative improvement of the prior would be effective. 



APPENDIX C: 
ESTIMATOR 



BIAS IN THE MEAN LIKELIHOOD 



Here we discuss the expected bias in the mean ellipticity mea- 
sured for galaxies for a simplified example of a galaxy with a 
gaussian surface brightness profile convolved with a gaussian PSF. 
Because of the unrealistic gaussian surface brightness profiles, the 
results calculated cannot be transferred to the actual CFHTLenS 
analysis: however, the gaussian model provides a useful illustra- 
tion of the origin of bias in the mean ellipticity estimate. This 
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wor k was done independen tly of that of iRefregier et all ^20121) 
and lMelchior & Violal ^2012*) but reaches similar conclusions. The 
work described here investigates particularly the bias on the mean 
likelihood estimator of ellipticity, rather than the maximum like- 
lihood estimator, and discusses also the correlation with the PSF 
that is induced by the presence of noise. Rather than calculate bias 
in the gaussian model using a Fisher matrix approach, we explic- 
itly calculate the shape of the likelihood surface for some simple 
examples. 

An elliptical object may be viewed as a sheared version of an 
isotropic object, and we define the isotropic coordinates x'i of pixel 
i in terms of the observed coordinates Xi as x'i = Ax,;, where 



— 62 



-62 

1 + ei 



(CI) 



and where ei and 62 are the two components of ellipticity. 

We define a gaussian surface brightness profile as /(xi) = 
exp [— ixf C^^Xi] where the pixel surface brightness covariance 
matrix is given by 



C ■- 



(1 



1 + 2ei + 

262 



262 

1 - 261 + 6^ 



(C2) 



where s is a size parameter and e = |e| = \/ e\ + 6|. 

If a gaussian galaxy with pixel surface brightness covariance 
matrix Cg is convolved with a gaussian PSF with covariance ma- 
trix Cp then the resulting surface brightness distribution is gaussian 
with covariance matrix Cq + Cp. 

If we fit a model surface brightness distribution, to some 
pixelated data /, the log(likelihood) of the fit may be written as 



log£ = 



1 [E,/rM 



(C3) 



where the sum is over pixels i, the background noise has rms a 
and is assumed stationary, the model galaxy's amplitude has been 
marginalised over and an arbitrary additive constant has been ne- 
glected (Paper I). 

If the data corresponds to some gaussian galaxy con- 
volved with a gaussian PSF, with independent, stationary 
gaussian noise added to each pixel, we may write fi = 
A/'exp [— ixf (Cg + Cp) ""^ Xi] + rii for pixel i, where M is 
a normalising constant and Ui is a gaussian random variate rep- 
resenting the pixel noise. Neglecting the bias caused by the term 
Ui, the galaxy's profile-weighted signal-to-noise ratio is given by 
/ a which allows us to define the normalisation 



constant as 



2\l/2 



(C4) 



7ri/2|CG-hCp|i/4' 

Hence the log(likelihood) of the data given some model galaxy 
with covariance matrix Cm, convolved with the PSF, may be writ- 
ten 



+ ni ) e 2 



10g£: 



{Cm+Cp)- 



2o-2 ^g-xf (Cm+Cj,)-1x, 



(C5) 

and so the expectation value of the log(likelihood) for this particu- 
lar model M is 



{log£) = 



2-kN''\{{Cg + Cp)-^ + [Cm + Cp)-^y 



a2 1 Cm + Cp 11/2 



+ 



^2J^JM2 



(C6) 
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Figure CI. Contours of log^, C for the example gaussian model and PSF 
described in the text. The cross at ei = 0.3, 62 = denotes the ellip- 
ticity of both the test galaxy and the maximum likelihood. The cross at 
ei = 0, 62 = 0.3 denotes the ellipticity of the PSF. Contours are plotted at 
intervals of 0.5 in log;, C below the maximum, for a galaxy with signal-to- 



noise ratio i^sN = 20. The contour interval scales as ui 



SN- 



where fl' = e-^^'^'fC^M+Cp) ^x, ^ q 

use the identity (A"^ + B~^)~^ = A{A + B)'^B in the first 
term. The second term has a value of unity, since ^ [E fi 'ni 

""^ E ' ignoring this irrelevant additive constant, we ob- 
tain 



(log£) = 



27r AA^ j Cg + Cp 1 1 Cm -HCp I 

<721Cg + 2Cp+Cm| 
2i/|n|Cg + Cp|^/"|Cm + Cp|^/" 



Cg + 2Cp + Cm 
= 2uI^\{Cg + Cp)-Y^'|(Cm + Cp)-Y^' 

\{Cg + Cp)-^ + {Cm + Cp)-^\. (C7) 

We can repeat this calculation for a set of models M, each 
with a different ellipticity or size, and hence build up the expected 
likelihood surface as a function of model ellipticity. Fig. lC II shows 
contours of (log^ C) for an example where the galaxy ellipticity 
was eg = {0.3, 0} and the PSF ellipticity was epsp = {0, 0.3}, 
with size cr = 3 pixels for both galaxy and PSF. The choice of 
PSF ellipticity is extreme, much larger than the PSF ellipticity in 
CFHTLenS, to better illustrate the bias. Although the location of 
the maximum is unbiased, the likelihood distribution has a sub- 
stantial asymmetric distortion, both towards e = and towards the 
PSF ellipticity, which results in the mean ellipticity being biased. 
2 In these illustrations, the size of the model galaxy has been held 
_ fixed (i.e. the galaxy size is assumed to be known). If this assump- 
tion is relaxed, and the galaxy size is also treated as a free param- 
eter, the likelihood surface changes shape, with a bias in both the 
mean and the maximum likelihood, and an excess of likelihood at 
extreme ellipticity can b e produced (see also Refregier et al. 201^; 
iMelchior & Violal2012h . We note that the application of a bayesian 
prior would suppress that excess likelihood, but we do not consider 
that effect here. 

The upper panel of Fig. |C2| shows the shear bias, plotted as 
1 -\- m, expected in this simple model, for fixed galaxy size, for a 



© 201 1 RAS, MNRAS OOP. [711241 



24 L. Miller et al. 





'SN 



Figure C2. ( Upper panel). The expected shear bias, 1 + m, arising from the 
bias in the mean elUpticity for the gaussian model described in the text, as 
a fimction of signal-to-noise ratio i^sN- {Lower panel). The expected PSF- 
galaxy cross-correlation signal, ^psF- galaxy, in th^ absence of shear, as a 
function of i^sN- 



simplified case where a population of randomly-oriented gaussian 
galaxies, all with intrinsic ellipticity eg = 0.25 and constant shear 
g = {0.05, 0}, have been measured with a gaussian PSF, the same 
size as the galaxy, with epsp = {0, 0.1}. The lower panel shows 
the expected cross-correlation between the galaxy and PSF. It can 
be seen that both measures show significant departures from zero 
bias at low signal-to-noise ratio, j/sn 20. In this model, even 
larger biases are seen for gaussian galaxies that are smaller than the 
PSF. In CFHTLenS, many galaxies are smaller than the PSF, how- 
ever the significant difference in surface brightness profiles of both 
the galaxies and the PSF mean that these results cannot be trans- 
ferred quantitatively to the actual data. Our approach, described in 
Section[8] was instead to measure the effects from realistic simula- 
tions designed to match the data. 



